NATURAL RESISTANCE ASSOCIATED MACROPHAGE PROTEIN AND USES THEREOF
FIELD OF THE INVENTION 

The present invention relates to a nucleotide sequence encoding 
a natural resistance-associated macrophage protein, the protein 
product thereof, nucleotide probes and primers thereto, 
polypeptide fragments of the protein and related antibodies. 

BACKGROUND TO THE INVENTION 

Macrophages are the main phagocytic cells of animals and play a 
key role in the immune system. Macrophages bind and ingest
particles recognised as foreign by the immune system. Such 
particles include microorganisms. 

The three microorganisms Salmonella typhimurium, Leishmania
donovani and Mycobacterium bovis (BCG) are all intracellular
pathogens of macrophages. Three separate groups of scientists
had previously identified genes capable of controlling resistance
and susceptibility to each of these microorganisms. The genes
were designated respectively Ity, Lsh and Bcg. Subsequent work
has led the scientists to conclude that Ity/Lsh/Bcg is a single
gene and is expressed at the macrophage level (Ref 1).

Recently, Vidal et al (Ref 2) cloned a murine gene as the most 
likely candidate to be Lsh/Ity/Bcg. This gene has been termed 
the natural resistance-associated macrophage protein (Nramp) 
gene. A cDNA for Nramp was isolated from a pre B-cell cDNA 
library and sequenced. The amino acid sequence for the protein 
product was deduced from the nucleotide sequence and predicts a 
53kDa protein. On the basis of the deduced amino acid sequence, 
Vidal et al proposed as a function of the Nramp protein the
transport of nitrate across the membrane of the intracellular
the microorganisms.

The present applicants have isolated and sequenced a macrophage- 
expressed Nramp cDNA. Contrary to the teaching of Vidal et al 
the present applicants have found a different nucleotide sequence 
including a region encoding an additional amino acid sequence at 
the N-terminus. Surprisingly, the additional amino acid sequence 
includes structural features which may be responsible for 
protein-protein interactions essential in signal transduction 
pathways thereby suggesting that Nramp controls early 
amplification of transmembrane signalling in disease resistant 
macrophages by binding the SH3 domain of tyrosine kinases or 
other molecules. 

SUMMARY OF THE INVENTION 

The present invention provides in one aspect a natural 
resistance-associated macrophage protein having an N-terminal 
region comprising an SH3 binding domain. When present in the 
macrophage, the protein is capable of controlling resistance to 
pathogenic microorganisms. 

SH3 (Src homology 3) domains are believed to mediate specific 
protein-protein interactions required in signal transduction (Ref 
3) and have been identified as related sequences in a variety of 
proteins (Refs 4 and 5). In one embodiment of the present
invention, the SH3 binding domain comprises the SH3 binding motif
PGPAPQPXPXR, more particularly PGPAPQPAPCR. This motif is found
in the protein obtainable from mice. In another embodiment of
the present invention, the SH3 binding domain comprises the SH3
binding motif PXSPTSPXPXXAPPRXT, more particularly
PTSPTSPGPQQAPPRET. This motif is found in the protein obtainable
from humans. Typically, SH3 binding domains are rich in proline 
and sometimes serine. 
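
By way of illustration only, the relationship between each generic SH3 binding motif and its more particular form can be checked programmatically. The following is a minimal sketch, assuming that X in a motif stands for any single amino acid residue; the function names are hypothetical and are not taken from the specification.

```python
import re

# Motifs and their particular forms as recited in the text above.
MURINE_MOTIF = "PGPAPQPXPXR"
MURINE_SEQUENCE = "PGPAPQPAPCR"
HUMAN_MOTIF = "PXSPTSPXPXXAPPRXT"
HUMAN_SEQUENCE = "PTSPTSPGPQQAPPRET"


def motif_to_regex(motif: str) -> re.Pattern:
    """Build a regex from a one-letter motif; X is assumed to match any residue."""
    return re.compile("".join("[A-Z]" if aa == "X" else aa for aa in motif))


def matches_motif(motif: str, peptide: str) -> bool:
    """Return True when the whole peptide conforms to the motif."""
    return motif_to_regex(motif).fullmatch(peptide) is not None


if __name__ == "__main__":
    print(matches_motif(MURINE_MOTIF, MURINE_SEQUENCE))
    print(matches_motif(HUMAN_MOTIF, HUMAN_SEQUENCE))
```

The same helper may be used to screen other candidate peptides against either motif; a peptide of the wrong length or with a substitution at a fixed position is rejected.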
particularly, the polypeptide segment is SPPRLSRPSYGSISSL. The
SH3 binding domain obtainable from humans preferably further 
comprises the polypeptide segment GPQRLSGSSYGSISS.

A further preferred feature of the N-terminal region is the 
presence of one or more consensus sequences for protein kinase 
C (PKC) phosphorylation. Preferably, the N-terminal region has 
two protein kinase C sites which flank the SH3 binding domain. 
Tyrosine residues may also flank the SH3 binding domain. 

Typically, the N-terminal region comprises 64 amino acids. 

The full amino acid sequences of the murine and human proteins 
are set out in Figure 9. Mutations or deletions may be present 
in each sequence provided that they do not substantially affect 
the activity of the protein. 

In a second aspect, the present invention provides a nucleotide 
sequence encoding the natural resistance-associated macrophage 
protein discussed above. Where the nucleotide sequence is a DNA 
sequence, this may be a genomic sequence containing introns and 
exons or a cDNA sequence obtainable from mRNA by reverse 
transcription. The SH3 binding domain of the protein obtainable 
from mice is preferably encoded by the DNA sequence comprising 
CCTGGCCCAGCACCTCAGCCAGCGCCTTGCCGG and may further comprise the 
upstream region AGCCCCCCGAGGCTGAGCAGGCCCAGTTATGGCTCCATTTCCAGCCTG.
More particularly, the 5' end of the genomic DNA sequence is set
out in Figure 4 and discussed in further detail below. A cDNA
sequence is also provided, as set out in Figure 2. The SH3
binding domain of the protein obtainable from humans is 
preferably encoded by the DNA sequence comprising CCG ACC AGC CCG 
ACC AGC CCA GGG CCA CAG CAA GCA CCT CCC AGA GAG ACC 
and may further comprise the upstream region GGT CCC CAA AGG CTA 
AGf n~n AGC TAT GGT TCC ATC TCC AGC. 
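
As an illustrative check only, the DNA sequences recited above can be translated with the standard genetic code to confirm that they encode the stated SH3 binding motifs and the murine upstream segment. The sketch below assumes the standard codon table and reading frame 1; the helper name translate is hypothetical. The human upstream region is omitted because it is not fully legible in this text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption of
# this sketch; the specification does not name a codon table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

MURINE_SH3_DNA = "CCTGGCCCAGCACCTCAGCCAGCGCCTTGCCGG"
MURINE_UPSTREAM_DNA = "AGCCCCCCGAGGCTGAGCAGGCCCAGTTATGGCTCCATTTCCAGCCTG"
HUMAN_SH3_DNA = "CCG ACC AGC CCG ACC AGC CCA GGG CCA CAG CAA GCA CCT CCC AGA GAG ACC"


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring whitespace between codons."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(MURINE_SH3_DNA))
    print(translate(MURINE_UPSTREAM_DNA))
    print(translate(HUMAN_SH3_DNA))
```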
sequences described may still result in a functional protein
product. Owing to the degeneracy of the genetic code it will be
readily apparent that numerous silent mutations within the
specified sequences will give rise to the same amino acid
sequence.
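
To give a sense of scale for the degeneracy referred to above, the number of distinct DNA sequences encoding a given peptide is the product of the number of codons available for each residue. The following sketch assumes the standard genetic code and applies the count to the murine motif PGPAPQPAPCR; the function name is hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order (assumed).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, e.g. 4 for P and 6 for R.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def synonymous_sequence_count(peptide: str) -> int:
    """Count the distinct DNA sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


if __name__ == "__main__":
    print(synonymous_sequence_count("PGPAPQPAPCR"))
```

Every one of these sequences other than the recited sequence differs from it only by silent mutations.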

The cDNA sequence has been deposited as part of plasmid pBabeAS.l 
under accession number NTC 12855 at the National Collection of 
Type Cultures, Central Public Health Laboratory, London, UK. The 
deposit consisted of a culture of E. coli DH5α transformed with
the plasmid and was given the date of deposition of 14 January
1994.

In a further aspect, the present invention provides a retroviral 
vector construct incorporating a cDNA sequence encoding the 
natural resistance-associated macrophage protein. 

Generally, the use of retroviral vectors presents a good method 
for gene transfer into haematopoietic cells and has advantages 
over other methods including stable transfer of a single copy of 
a gene into the recipient cell, and high efficiency of gene 
transfer into target cells. Vector constructs are generated by 
ligating the gene of interest using standard molecular biology 
techniques into a non-replication-competent viral genome. The
resultant vector construct is transferred into a cell line (the 
packaging cell line) capable of replicating the viral genome and 
packaging it into infective pseudovirus particles. Depending on 
the virus envelope protein encoded by the packaging cell, the 
resulting pseudovirus particles can be capable of infecting a
wide host range (amphotropic) or be restricted to rodent cells
(ecotropic).

In the present application the pBabe plasmid was used as a 
suitable retroviral vector. This plasmid is discussed in further 
marker gene which confers antibiotic resistance. Antibiotic
resistant clones can then be tested for their ability to secrete 
functional pseudovirus particles and to infect recipient cells. 

The selected retroviral constructs of the gene carrying Nramp can 
be used as the basis for gene therapy to create a functional copy 
of the gene where lack of expression has been observed, or where 
a non-functional copy exists. The most likely method of gene
transfer is via the bone marrow or progenitor cells isolated from 
the circulation. The gene can be retrovirally introduced into 
these stem cells removed from the patient. The stem cells would 
then be reintroduced into the patient with the aim of 
repopulating the myeloid/lymphoid cell lineages with cells 
containing a functional copy of the gene. 

In a further aspect, the present invention provides nucleotide 
probes or primers capable of hybridizing to a portion of the 
nucleotide sequence described, preferably to at least a portion 
of the sequence above which encodes the N-terminal region of the
protein comprising or upstream of the SH3 binding domain. Probes
can be single or double stranded and can be made by recombinant
DNA technology from copies of the gene, or portions thereof, or
by synthetic routes such as those which lead to oligonucleotide probes.
Primers, such as those for polymerase chain reaction work, are 
single stranded and preferably are at least 18 nucleotides long. 
Both probes and primers based on the sequence can be used at both 
DNA and RNA levels for diagnosis and such probes and primers can 
be readily made using the sequence information provided herein. 
The cDNAs as described above are themselves useful probes for the 
gene. The present applicants have found that the cDNA sequence 
of the murine gene described above can be successfully used as 
a probe for the corresponding human gene. 

The probes or primers have diagnostic potential, for example to 
expressed but sub-functional or non-functional protein. Genetic
defects to be diagnosed may occur in the coding sequence, or in
the promoter or 3' untranslated regulatory regions of the gene
and so probes directed to both genomic and cDNA sequences may be
useful.

Primer pairs are also provided which are capable of hybridising 
to specific sequences in the 5' region of the human NRAMP gene, 
permitting amplification of a portion of the promoter region of 
the human gene. The promoter region preferably includes a poly
gt site, especially in the configuration t(gt)5ac(gt)5ac(gt)n g in
which n=0, or an integer. The promoter region may further
comprise a transcription site, and, optionally, one or more of:
an interferon-γ response element; a NFκB site; an AP-1 site; a
W-element; a PU.1 core motif; and a PEA3 site. Preferably each
of the sites, elements or motifs is present in the order
specified in Figure 11. Polymorphisms in the poly gt site, 
specifically located in the third cluster of gt repeats, where 
n may equal any number of repeats, typically 4 to 12, may be 
diagnostic for reduced or defective expression of the human gene 
and so primers permitting PCR amplification of this region are 
of particular importance. Probes to the promoter region are also
provided, preferably allele-specific probes to the promoter
region, for example allele-specific oligonucleotides.
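
Purely as an illustration of how the poly gt polymorphism might be scored from sequence data, the sketch below locates the poly gt site and reports n, the number of gt repeats in the third cluster. It assumes that the first two clusters each contain five gt repeats, as the configuration above is read here; the regular expression, the function names and the synthetic alleles are hypothetical and are not taken from Figure 11.

```python
import re
from typing import Optional

# t (gt)5 ac (gt)5 ac (gt)n g, with n captured as the third cluster.
POLY_GT_SITE = re.compile(r"t(?:gt){5}ac(?:gt){5}ac((?:gt)*)g")


def third_cluster_repeats(sequence: str) -> Optional[int]:
    """Return n for the first poly gt site found, or None if no site is present."""
    match = POLY_GT_SITE.search(sequence.lower())
    if match is None:
        return None
    return len(match.group(1)) // 2


def make_allele(n: int) -> str:
    """Build a synthetic poly gt site with n repeats in the third cluster."""
    return "t" + "gt" * 5 + "ac" + "gt" * 5 + "ac" + "gt" * n + "g"


if __name__ == "__main__":
    for n in (9, 10, 11):
        print(n, third_cluster_repeats("ccaa" + make_allele(n) + "ttcc"))
```

In practice allele length would be read from the size of the PCR product; the sketch only shows that the configuration is unambiguous enough to be parsed.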

In a preferred embodiment, the human protein is encoded by 15
exons, each of which is flanked by intron boundary regions. The
exons are preferably those shown in Table 3. Probes or primers
are provided, which are capable of hybridising to at least a 
portion of an individual exon and/or its flanking intron boundary 
region. Preferably, primer pairs are provided which are capable 
of hybridising to the intron boundaries of each exon so as to 
amplify the respective exon. More preferably, the primer pairs 
are capable of hybridisation to any one of the intron boundary 
PCR amplification of exons to identify such polymorphisms using
electrophoretic techniques are of particular importance. More 
preferably, the primer pair capable of hybridisation to the 
intron boundaries 5' and 3' of human exon 2, permitting PCR 
amplification of this exon, may be critically important in 
permitting detection of a polymorphism involving a 3 amino acid 
deletion in the putative SH3 binding domain encoded within this 
exon. The importance of this region to the function of the NRAMP 
gene represents a key component of this invention. 

Antisense oligonucleotides may also be produced using the 
nucleotide sequence described above. Antisense oligonucleotides 
may be used to interrupt the expression of the gene and this
could provide a potentially important local therapy for 
autoimmune disorders or cancers. 

As discussed in further detail below, the protein product from 
the gene is predicted to be a polytopic membrane protein. Whilst 
antibodies against the protein will be important tools in 
diagnosing levels of expression of the protein product in various 
cell populations, only those portions of the protein which are 
not occluded by membrane are likely to be accessible to 
antibodies in the intact or native protein conformation. 
Accordingly, in a further aspect of the present invention there 
is provided a polypeptide fragment of the protein which comprises 
at least a portion of the structural domain not hidden by 
membrane. Preferably, the polypeptide fragment comprises at 
least a portion of the N-terminal region. Two structural domains
of potential importance are: the N-terminal cytoplasmic domain
proximal to the first membrane-spanning domain and comprising
amino acids 1 to 82; and the C-terminal cytoplasmic domain distal 
to the last membrane-spanning domain and comprising amino acids 
414 to 458. 
Nramp are ligated. The pGEX series of prokaryotic expression
vectors is a particularly useful type of vector into which the
Nramp sequences may be ligated. This is a standard procedure,
further information about which may be found in (Ref 8).

In a further aspect, the present invention provides an antibody 
to the natural resistance-associated macrophage protein or an 
antibody to a polypeptide fragment therefrom, more particularly 
to one of the accessible polypeptide domains discussed above. 
The Nramp fusion proteins may be used as antigens to inoculate
rabbits or rats so as to produce antibodies. Using standard
techniques both polyclonal and monoclonal antibodies may thus be
raised.

In particular, antibodies recognising epitopes within specific 
amino acid sequences contained within the N-terminal
(DKSPPRLSRPSYGSISS; PQPAPCRETYLSEKIPIP; and
GTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQ) and C-terminal
(WTCCIAHGATFLTHSSHKHFLYGL) regions of the protein will recognise
the protein in both mouse and man, and can be applied for both 
research purposes and as a diagnostic tool in man. 

In addition to diagnosis as discussed above, neutralizing 
monoclonal antibodies could be produced to block the function of 
the gene in situations where adverse effects are observed, such 
as autoimmunity or cancer resulting from expression of the gene. 

The presence or absence of the gene product could have both 
beneficial and detrimental effects depending on the disease 
status. In infectious diseases, particularly involving
intracellular pathogens of the myeloid cell lineage, absence of
a functional gene product may result in chronic susceptibility 
to the disease. In the case of autoimmune disorders or cancers 
of the myeloid or lymphoid cell lineages, overexpression of the
be useful for patients presenting with atypical responses to
infection, certain autoimmune disorders, or cancers of the 
myeloid or lymphoid lineages. 

Another situation in which a deficit in the NRAMP gene might 
relate to cancer is where cancers of other cell lineages are 
destroyed by activated macrophages, through sensitivity to 
hydrogen peroxide generated by a respiratory burst response, TNF-α,
or nitric oxide. All of these macrophage functions are
regulated by NRAMP. In this case, corrective gene therapy via 
stem cell gene transfer would be appropriate. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will now be described further by way of 
example only with reference to the attached drawings in which: 

FIGURE 1 shows a restriction map of the Nramp X8.1 cDNA clone of 
the present invention as compared with that of Vidal et al; 

FIGURE 2 shows the sequence of macrophage X8.1 Nramp in
accordance with the present invention; 

FIGURE 3 is a schematic representation of the genomic DNA 
corresponding to nucleotides 31 to 456 of X8.1 as compared with 
the corresponding DNA of Vidal et al; 

FIGURE 4 shows the 5' sequence of the genomic DNA up to 
nucleotide 1911; 

FIGURE 5(a) shows the results of northern blot hybridizations; 
FIGURE 5(b) shows the results of primer extension on total RNA
from B10.L-Lsh macrophages;
FIGURE 7 shows the nucleotide and deduced amino acid sequence of
exon 2 of human NRAMP; 

FIGURE 8 shows the result of an amino acid database search with 
the N-terminal sequence for human NRAMP;

FIGURE 9 shows the result of a Clustal V multiple sequence 
alignment for the deduced amino acid sequence for human NRAMP, 
murine Nramp clone X8.1 [55], and the yeast mitochondrial
proteins SMF1 and SMF2 [35];

FIGURE 10(A) shows the results of amino acid database searches
for human NRAMP;

FIGURE 10(B) shows the results of a Clustal V multiple sequence 
alignment for human NRAMP, mouse Nramp, SMF1 and SMF2, and the
expressed sequence tags [60] of Oryza sativa (rice; accession
number d15268) and Arabidopsis thaliana (accession number z30530)
genes, reading frames 1 and 2 respectively;

FIGURE 11 shows the 440 bp of putative promoter region human 
NRAMP sequence 5' of the transcription start site; and 

FIGURE 12 shows two families segregating for (a) alleles 2 and 
3, or (b) alleles 1, 2 and 3 of the 5' dinucleotide repeat 
polymorphism and autoradiographs of polymorphic PCR products 
separated by denaturing polyacrylamide gel electrophoresis. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Details of Experimentation and Results 

Sequence analysis of Nramp clones from macrophage cDNA library.
Macrophage Nramp clones were isolated from an activated (4 h
stimulation; 25 U/ml interferon-γ, 10 ng/ml Salmonella typhimurium
corresponding to nucleotides 1410 - 1812bp of the published (Ref
2) sequence. Following plaque purification 35 clones from 10^6
recombinants were analysed by PCR using sense and anti-sense
Nramp primers in combination with T3 and T7 vector arm primers. 
This allowed the mapping of clones with respect to the published 
sequence. 20/35 were found to be 1.0 - 1.5 kb and were not 
analysed further. The remainder were of 2.1 - 2.3 kb and 
potentially encoded full length Nramp coding sequence. Clones 
were initially restriction mapped and four selected for 
sequencing (Sequenase II) including the longest clone X8.1. 

Genomic sequencing.

From the macrophage cDNA sequence, PCR primers were generated to
amplify a 2kb region of DNA from both yeast artificial chromosome
(YAC clone C9C28; Princeton library) and mouse genomic DNAs. The
products were cloned in the pCR vector (Invitrogen) and sequence 
(Sequenase II) determined from double stranded plasmid DNA from 
at least two clones of each, using oligos complementary to the 
cDNA sequence. Splice junctions were identified by comparison 
of the genomic and complementary DNA sequences. 

Northern blot and primer extension analysis of Nramp expression.
Cytoplasmic total RNA isolated in the presence of vanadyl 
ribonucleoside complexes was utilised for denaturing gel 
electrophoresis with glyoxal and Northern blotting, or directly 
for RT reactions for primer extension analysis. Hybridizations 
were performed using probes isolated from a genomic fragment (bp 
1 - 1482) 5' of exon 3 (see results). Restriction digestion with
BamHI generated two probes covering a X8.1-specific (= bp 1 - 587
of the genomic sequence; Fig. 2b) region or the putative 5'
untranslated sequence (= bp 588 - 1482 of the genomic sequence;
Fig. 2b) of the published (Ref 2) cDNA. For primer extensions
oligonucleotides designed to be specific for X8.1-like RNAs (TCT
GCG CTG GGA ATG GGG; bp 538-521 of the genomic sequence) or for
with 25 Units of AMV reverse transcriptase at 42°C, terminated by
the addition of gel loading buffer, and sized against a
sequencing ladder following denaturing polyacrylamide gel
electrophoresis.
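
For illustration, the primer extension oligonucleotide recited above (TCT GCG CTG GGA ATG GGG; bp 538-521 of the genomic sequence) can be checked for length and converted to the strand it anneals to. The sketch below assumes that the descending coordinates mean the primer is the reverse complement of the genomic sense strand over that interval; the function name is hypothetical.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER = "TCT GCG CTG GGA ATG GGG"
PRIMER_START, PRIMER_END = 538, 521  # coordinates as given in the text


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement, ignoring whitespace between codons."""
    compact = "".join(sequence.split()).upper()
    return compact.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    target = reverse_complement(PRIMER)
    print(target, len(target))
    # The interval 538-521 spans 18 positions, matching the primer length.
    print(PRIMER_START - PRIMER_END + 1)
```

An oligonucleotide of this length also satisfies the preference, stated earlier, for primers of at least 18 nucleotides.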

Referring to Figure 1, cDNA clones isolated from an activated 
macrophage library carrying the resistant allele of murine Nramp 
were restriction mapped, sequenced and found to be identical over 
the coding region of the published (Ref 2) sequence, except for 
two silent mutations (359bp, C; 965bp, T). Regions of sequence
identity between the published clone and macrophage clone X8.1, 
which includes the ATG (d) codon and the major ORF (solid bars) 
of the published clone, are shown within the broken lines. The 
positions of Smal (S) and PvuII (P) cleavage sites demonstrate 
divergence between the clones at the 5' end. Novel sequence 
identified in the 5' region of X8.1 contained a more proximal ATG 
(p) codon and an extended ORF (open bar) encoding an additional 
64 N-terminal residues compared with the published Nramp. 

Figure 2 shows the sequence of macrophage X8.1. The nucleotide 
sequence specific to X8.1 is underlined. The 64 N-terminal
residues encoded by X8.1 occur 5' of the distal initiation codon
(=Met at position 65) identified by Vidal and coworkers (Ref 2).
This additional 64 amino acid sequence is identical in resistant 
and susceptible mice (data not shown) and is rich in Ser 10/64, 
Pro 10/64, basic 7/64 residues, and contains 3 consensus PKC 
phosphorylation sites (S/T-X-R/K) on Ser 3, 37, and 52. As 
described previously (Ref 2), putative N-linked glycosylation 
sites occur at residues 311 and 325, and hydrophobic potential 
membrane-spanning domains are underlined. Database searches also
revealed a B-2 alu-like repetitive element (boxed) within the 3'
UTR, which produces complex signals when the full-length X8.1
clone is hybridised to mouse genomic DNA.
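
The consensus PKC phosphorylation pattern S/T-X-R/K referred to above lends itself to a simple scan. The following sketch is illustrative only: it applies the pattern to the N-terminal epitope PQPAPCRETYLSEKIPIP recited earlier in this specification, and the positions it reports are counted within that peptide, not within the full-length protein. The function name is hypothetical.

```python
import re
from typing import List

# S or T, then any residue, then R or K. The lookahead allows overlapping sites.
PKC_CONSENSUS = re.compile(r"(?=[ST].[RK])")


def pkc_site_positions(peptide: str) -> List[int]:
    """Return 1-based positions of S/T residues that start an S/T-X-R/K match."""
    return [match.start() + 1 for match in PKC_CONSENSUS.finditer(peptide.upper())]


if __name__ == "__main__":
    print(pkc_site_positions("PQPAPCRETYLSEKIPIP"))
```

A match to the consensus indicates only a candidate site; whether a given serine or threonine is phosphorylated would need to be established experimentally.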
nucleotides 31-456 of X8.1 and spanning the point of divergence
with the published (Ref 2) clone was isolated and sequence
determined to elucidate the mechanism generating the two clones.
The additional sequence of macrophage Nramp is encoded by two 
unique exons (1 and 2; solid bars) contiguous in the cDNA 
sequence (figure 1) with exons (3 and 4; solid bars) common to 
both X8.1 and the published clone. Contiguous with and 5' of the 
third exon is the putative 5' UTR (open bar) found in the 
published clone. Predicted splicing patterns (dotted line) are 
indicated above (X8.1) and below (published pre-B cell clone) the 
map. Also shown are the sequencing gel reads (arrows) determined 
using specific Nramp primers from the cDNA sequence, as well as 
new primers specific to the genomic sequence. 

Figure 4 shows the sequence of genomic DNA spanning the point of 
divergence of X8.1 and the published (Ref 2) sequence. Exonic 
nucleotide sequence is shown in capitals, with the predicted 
amino acid sequence indicated above in single letter format. 
Intron sequence is shown in small letters. The region of 5' UTR 
from the published clone, contiguous with the third exon, is 
highlighted by overlining. The codon (ATG = Met) where this 
terminates indicates the initiation codon of the published 
sequence. A probe containing sequence unique to the 5' region 
(bp 1 - 587) of the mouse genomic sequence also hybridises to 
genomic lambda clones isolated from a subcloned human YAC (clone
AM11/D3/14; ICRF library) known to hybridise to the homologous 
region of human 2q35. 

Figure 5 shows that Nramp transcripts encoding the additional 
64 amino acids are the only form of Nramp expressed in the 
macrophage.

Referring to Figure 5(a), to identify the nature of Nramp
(lanes 2,6,10,14), LPS (lanes 3,7,11,15), or interferon-γ plus
LPS (lanes 4,8,12,16). Probes specific for unique X8.1
5' sequence (lanes 1-4), or for the more distal putative 5' UTR of
the published (Ref 2) sequence (lanes 5-8), were used and
compared against the same blots reprobed with constitutively
expressed GAPDH (lanes 9-16). Hybridizing RNAs could only be
detected with the X8.1-specific sequence, despite loading twice
as much RNA on blots hybridised with the more distal probe.
Results are shown for RNA extracted from bone marrow-derived 
macrophages from C57BL/10ScSn mice. Slot blot analysis (not 
shown) confirmed that the X8.1-specific probe hybridized to RNA
from both susceptible and resistant macrophages. Southern blot
analysis (not shown) confirmed that both probes hybridized to
EcoRI fragments of 3500 and 500 bp in mouse genomic DNA from both
C57BL/10ScSn (Lsh^s) and B10.L-Lsh^r mice.

Referring to Figure 5(b), to identify the 5' terminus of Nramp
transcript expressed in macrophage RNA, primer extensions were
performed with 10 μg of total RNA from B10.L-Lsh^r macrophages
using oligonucleotides specific to the putative 5' untranslated
region of the published (Ref 2) sequence (lane 1), or to 5'
sequence unique to X8.1 (lane 2). The numbers of nucleotides
from the 5' end of the primer are shown. Control reactions with
tRNA gave no products with either primer. These experiments
confirm that RNA transcripts bearing the putative 5' untranslated
region of the published cDNA are not present in resting (not
shown) or activated macrophages, whereas transcripts
corresponding to the X8.1 sequence were identified with
transcriptional initiation sites mapping 21 and 22bp (doublet)
5' of the proximal ATG codon. Similar results were obtained
using RNA from C57BL/10ScSn (Lsh^s) macrophages as template.

Figure 6 shows that macrophage Nramp encodes an N-terminal SH3 

sequer.cu unique uu T.acropnag^ -exii-dd-..; ;u - .i-_u . * . 
of sequence matches particularly with the proline rich sequence.
Multiple sequence alignments allowed for the generation of a
consensus motif over this region: PGPAPQPXPXR (solid vertical
bar). Matches were found for three molecules involved in signal
transduction: the focal adhesion kinase (Ref 20) (50% identity
over 26 residues); Drosophila dynamin shibire protein (Ref 17)
(55% identity over 20 residues); and the adenylate cyclase
stimulatory beta-1-adrenergic receptor (Ref 19) (57% identity
over 21 residues). A proline, serine rich domain has been
identified as a functional SH3 binding domain in dynamin (Refs
17,3). The nine best matches were aligned with each other and
residues boxed where four or more exhibited identities. Also
shown are the two PKC sites (hatched vertical bars) on S3 and S37
which flank the region exhibiting sequence identity. Tyrosine 
residues (asterisk) occur on either side of the consensus motif 
indicating conservation of this part of the sequence. 

Figure 7 shows the nucleotide and deduced amino acid sequence of 
exon 2 of human NRAMP obtained from genomic sequence analysis. 

Exon 2 in murine Nramp encodes the putative SH3 binding domain 
with amino acid matches to a number of signal transduction 
molecules. To characterise the structure of the same region of
the human NRAMP gene, a yeast artificial chromosome hybridising
to 2q35 was subcloned into EMBL3 and the resulting library
screened by cross-species hybridisation using a murine probe to
identify clones containing this exon. Genomic sequence across
exon 2 was obtained (figure 7), with splice donor and acceptor
sites conforming to the GT and AG boundaries as identified in the
murine sequence.

Figure 8 shows that N-terminal sequence for human NRAMP encodes 
an SH3 binding domain structure.
sequence maintained (Figure 8). These include the distal
consensus sequence for phosphorylation by PKC. Both tyrosine (Y)
residues are maintained and share identical positions as murine
Nramp, and the human exon 2 sequence is rich in serine (S 9/48
compared to 9/45 in mouse); proline (P 10/48 compared to 10/45 
in mouse); and basic (5/48 compared to 5/45 in mouse) residues. 
Of these important residues, 6/9 S, 6/10 P, and 4/5 basic 
residues show identical positions within murine and human exon 
2. The spacing of the prolines is subtly different in the
consensus sequence for the SH3 binding domain of the human gene:
at positions 1 4 7 9 13 14 compared with 1 3 5 7 9 in mouse. The
human consensus motif over this region is: PXSPTSPXPXXAPPRXT.
The 3 codon insertion in human exon 2 forms the 5' segment of 
this proline rich domain. This insertion region has an unusual 
nucleotide sequence consisting of an almost perfect 3 times 9 
nucleotide repeat, representing a region of some instability and 
source of polymorphism in man (Ref 32) which could influence 
function. The presence of the extra 3 codon segment within the 
human gene sequence produced some additional amino acid sequence 
identities on screening databases. These include several 
proteins involved in cytoskeletal interactions or signal 
transduction pathways: microtubule associated protein 4; adenylyl
associated protein; phospholipase C β3; phosphatidylinositol 3-
kinase regulatory subunit p85α (PI3-kinase p85α); ankyrin; and
zyxin.
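
The proline spacing quoted above for the human and murine consensus motifs can be reproduced directly from the motifs themselves. The following minimal sketch lists the 1-based positions of proline in each motif; the function name is hypothetical.

```python
from typing import List

HUMAN_MOTIF = "PXSPTSPXPXXAPPRXT"
MURINE_MOTIF = "PGPAPQPXPXR"


def residue_positions(motif: str, residue: str = "P") -> List[int]:
    """Return the 1-based positions at which the residue occurs in the motif."""
    return [index for index, aa in enumerate(motif, start=1) if aa == residue]


if __name__ == "__main__":
    print("human :", residue_positions(HUMAN_MOTIF))
    print("murine:", residue_positions(MURINE_MOTIF))
```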

Computer-assisted analysis. 

Hydropathy profiles of the predicted N-terminal amino acid 
sequence of macrophage-expressed Nramp were obtained by
computer-assisted analysis using the algorithm and hydropathy values of
Kyte and Doolittle (Ref 14). Amino-acid sequence comparisons
were made using the FASTA programme on-line to the CRC Resource
Centre.
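
A hydropathy profile of the kind described can be sketched as a sliding-window average of per-residue hydropathy values. The example below uses the published Kyte and Doolittle scale; the window length of 7 and the choice of the murine motif PGPAPQPAPCR as input are assumptions made for illustration, since the window actually used is not stated here, and the function name is hypothetical.

```python
from typing import List

# Kyte and Doolittle hydropathy values for the twenty standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> List[float]:
    """Return the mean hydropathy of each window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    for start, score in enumerate(hydropathy_profile("PGPAPQPAPCR"), start=1):
        print(start, round(score, 2))
```

Negative window averages indicate a hydrophilic stretch, whereas sustained positive values over a longer window would be expected for a membrane-spanning segment.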
Nramp cDNAs, 14 of which differed from the published (Ref 2)
Nramp in the 5' terminal sequence. The longest macrophage-derived
cDNA (Fig. 1; X8.1) was 186bp shorter than Nramp. It
contained the full length coding region for the previously 
predicted protein, and exhibited 100% identity with no in frame 
stop codons for the region (bp 209-263) of untranslated sequence 
immediately 5' of the published initiation codon. However, 
nucleotides 1-208 of X8.1 shared no identity with the published 
sequence. A more proximal ATG codon was identified at 72bp in 
X8.1, preceded by an in frame stop codon at 36bp. This proximal
translational initiation codon is followed by an ORF of 192bp (64
amino acids) that leads into the ORF previously reported.
Previous studies have shown that proximal initiation codons are
utilised in more than 90% of all genes analysed (Ref 15). Nor
was there any evidence that the distal initiation codon would be
favoured, since both distal and proximal initiation codons and
flanking sequences are identical (TCCTCATGA) and display only two
identities with the optimal (Ref 16) (CC(A/G)CCATGG) consensus.
Hence, there is no a priori reason why the distal initiation 
codon would be used. 
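
The comparison between the flanking sequence of the initiation codons and the optimal consensus can be made explicit. The sketch below assumes that the two sequences are aligned on the ATG, that the ATG itself is excluded from the count, and that the A/G position of the consensus is written with the IUPAC code R; the function name is hypothetical.

```python
# IUPAC codes needed for this comparison; R stands for A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}

NRAMP_CONTEXT = "TCCTCATGA"      # flanking sequence of both initiation codons
OPTIMAL_CONSENSUS = "CCRCCATGG"  # CC(A/G)CCATGG


def flanking_identities(context: str, consensus: str, start_codon: str = "ATG") -> int:
    """Count matching flanking positions, ignoring the start codon itself."""
    if len(context) != len(consensus):
        raise ValueError("sequences must be the same length")
    atg_at = context.index(start_codon)
    if consensus.index(start_codon) != atg_at:
        raise ValueError("start codons are not aligned")
    skip = range(atg_at, atg_at + len(start_codon))
    return sum(
        1
        for position, (base, code) in enumerate(zip(context, consensus))
        if position not in skip and base in IUPAC[code]
    )


if __name__ == "__main__":
    print(flanking_identities(NRAMP_CONTEXT, OPTIMAL_CONSENSUS))
```

Under these assumptions the matching positions are the C at -4 and the C at -1 relative to the ATG.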

Genomic sequence for the 5' region of Nramp. To determine 
whether mechanisms exist which could generate two RNAs and hence 
two types of Nramp clones, a region of genomic DNA spanning the 
point of divergence was characterised corresponding to 
nucleotides 31-456 of X8.1 (Fig. 2). This region is encoded by
four exons interspersed by three introns of 395, 900 and 241bp,
with all splice donor and acceptor sites conforming to the GT and
AG boundaries. The first 47 amino acids of the 64 amino acid
N-terminal domain of X8.1 are encoded by two proximal exons
unique to this clone. The remaining 17 amino acids are encoded 
by exon three, with exons three and four common to both X8.1 and 
the published (Ref 2) Nramp cDNA. The 5' UTR sequence from the 
published clone was found in the 900bp intron contiguous with and 
published Nramp sequence it contains both coding and non-coding
sequence. Although a complex mechanism involving alternative
splicing associated with an internal splice acceptor site and 
dual promoter control could be formulated to describe the origin 
of both forms, it seems more likely that the published (Ref 2) 
cDNA clone contains a fragment of the 900bp intron at its 
proximal end. This is consistent with the observation that a 
number of the macrophage-derived Nramp clones isolated here were 
found to contain sequence that exhibited identity with the first 
Nramp intron identified in genomic DNA (not shown). 

Only one form of Nramp is expressed in macrophages. To confirm 
the hypothesis that the RNA encoding the longer polypeptide is 
the form expressed in macrophages, a number of different 
experimental approaches were adopted (Fig. 3). Using macrophage 
RNA as template, primer extension with an oligonucleotide unique 
to the 5' region of λ8.1 yielded products in both susceptible and 
resistant mice. In direct contrast, no products were generated 
using an oligonucleotide within the putative 5' UTR of published 
(Ref 2) Nramp. A probe covering the 5' region unique to λ8.1 
also hybridized well to Northern and slot blots of macrophage RNA 
from susceptible and resistant mice, whereas a probe covering the 
putative 5' UTR of the published clone showed no hybridization. 
Hence, the only form of RNA transcript present in macrophages is 
that which conforms to the λ8.1 predicted polypeptide sequence, 
suggesting that this form of the Nramp gene is responsible for 
host resistance. 

Predicted structure and sequence identities across the N-terminus 
of macrophage-expressed Nramp. In order to determine how 
macrophage-expressed Nramp might relate to Lsh/Ity/Bcg gene 
function, hydropathy (Ref 14) plots and amino acid database 
searches were undertaken over the newly identified 64 amino acid 
domain. The former (data not shown) demonstrated that the new 
three PKC phosphorylation sites (in addition to the two 
identified in the published Nramp sequence), and a number of 
matches with several unrelated proteins (Fig. 4). The most 
intriguing matches were: (i) with the dynamin shibire protein 
(Ref 17) of Drosophila, related to mammalian dynamin (dephosphin) 
which acts as a synaptic phosphoprotein in rat brain (Ref 18) ; 
(ii) with the proline rich third cytoplasmic domain of the 
adenylate cyclase stimulatory and G protein coupled 
beta-1-adrenergic receptor (Ref 19); and (iii) with the focal 
adhesion kinase (Ref 20) that can be modulated by 
integrin-dependent phosphorylation (Ref 21). The region of 
identity with the C-terminal domain of dynamin has been implicated (Ref 22) in 
binding anionic phospholipids, microtubules and Src homology 3 
(SH3) domains. SH3 domains (Refs 3,4), identified as related 
sequences in different tyrosine kinases (TK) but outside the 
catalytic domain, are modular and found in a number of proteins 
such as the non-receptor TKs, phospholipase C-gamma and other 
structural proteins of the cytoskeleton. Whilst the function of 
SH3 domains (Ref 4) is not as well characterised as the SH2 
counterpart, it is believed they mediate specific protein-protein 
interactions obligatory for signal transduction (Ref 3). IL-2R 
beta (Ref 25) and erythropoietin (Ref 26) receptors, for example, 
exhibit serine and proline rich intracellular domains which 
associate with TKs mediating phosphorylation essential to 
receptor function. Members of the Src family of 
membrane-associated TKs, including Hck and Fgr, are also found in 
macrophages (Ref 27). Both exhibit differential kinetics in 
response to priming/activation signals and could be implicated 
in Nramp-mediated signal transduction pathways. Hck, in 
particular, has recently been shown to be involved in signal 
transduction for TNF-α production in murine macrophages (Ref 28), 
a step which we have demonstrated (Ref 11) is crucial in the 
pathway to enhanced nitric oxide production and antimicrobial 
activity in Lsh resistant macrophages. 

which mediate binding to extracellular matrix proteins and signal 
via TKs, is sufficient to stimulate enhanced TNF-α production in 
resistant but not susceptible macrophages (Ref 9). Overall, the 
multiple PKC phosphorylation sites on Nramp, together with the 
new SH3 binding domain identified here, provide compelling 
evidence that Nramp mediates resistance by controlling signal 
transduction for macrophage priming/activation. 

N-terminal sequence analysis of human NRAMP supported the 
findings with murine Nramp in showing sequence identity over the 
putative SH3 binding domain with a series of proteins involved 
in cytoskeletal interactions or signal transduction pathways. 
Of these, PI3-kinase p85α (Ref 10) is of particular interest 
because it functions by binding to phosphorylated protein 
tyrosine kinase via SH2 domains (Ref 12), and acts as an adaptor 
mediating the association of the p110 catalytic unit to the 
plasma membrane. PI3-kinase p85α also has an SH3 domain. 
Ankyrin B (Ref 13) is a molecule linking integral membrane 
proteins to cytoskeletal elements, and zyxin (Ref 23) is an 
adhesion plaque protein and a possible component of a signal 
transduction pathway mediating adhesion-associated gene 
expression. Overall, this evidence supports our earlier 
conclusion, based on the putative SH3 binding domain of the murine 
gene, that this domain participates in protein-protein 
interactions important in signal transduction, and/or protein 
interactions (e.g. binding of tyrosine kinases mediating 
phosphorylation on tyrosines) which regulate the transport 
function of the molecule. 

Nramp gene transfer studies. 

A number of Nramp retroviral vector constructs were made, all 
based on the pBabe plasmid. These include the cDNAs encoding the 
predicted protein described above, together with a C-terminal 
deletion construct encoding the proximal 72 amino acids of the 
marker gene which confers resistance to puromycin. A number of 
these resistant clones have been tested for their ability: (i) 
to secrete functional pseudovirus particles by RNA slot blotting 
and hybridisation with an Nramp probe; and (ii) to infect 
recipient cells and confer antibiotic resistance. Infectious 
particles from the highest titre lines will be used for in vivo 
gene transfer. This same construct has been introduced into a 
murine macrophage cell line (RAW 264) which expresses a different 
allelic ("Lsh susceptible") variant from that of the 
vector-derived Nramp gene. 

Several clones have been identified that co-express both forms 
of Nramp as monitored by PCR followed by allele-specific 
oligonucleotide hybridisation. Functional experiments have been 
performed to demonstrate that Nramp is the disease resistance 
gene Ity/Lsh/Bcg, by demonstrating that the resistant allele 
confers macrophage activation phenotypes previously associated 
with the action of the Ity/Lsh/Bcg gene. More specifically: 

Table 1 demonstrates that the Nramp resistant allele confers an 
enhanced baseline PMA-elicited respiratory burst response 
compared to the control susceptible transfectant clones. This 
resting PMA-elicited respiratory burst is completely extinguished 
in susceptible but not resistant transfectants following 
treatment of the macrophages with bacterial lipopolysaccharide 
(LPS). Respiratory burst products mediate antimicrobial and 
tumouricidal activity. 

Table 1. Resistant allele RAW264.7 transfectants generate 
enhanced RB responses which are not extinguished following LPS 
stimulation. PMA-elicited RB was measured using a standard assay 
in which superoxide reduces nitro blue tetrazolium to formazan 
in (a) resting resistant and susceptible transfectants, and (b) 
after 24 or 30 hours incubation with LPS (25ng/ml). To normalise 
p<0.001 = ***) for results of Student's t-tests used to compare 
each resistant transfectant against the susceptible transfectant 
30S. Similar levels of significance were observed for 
comparisons with 10S and 25S. Results representative of 5 
independent experiments performed. 



Transfectant    24 hour            24 hour 
                resting cells      LPS/IFNγ treated 

10S             0.155±0.013        0.025±0.006 
25S             0.181±0.026        0.002±0.003 
30S             0.147±0.025        0.034±0.008 

7.1R            0.296±0.056*       0.219±0.022*** 
7.2R            0.399±0.077***     0.292±0.029*** 
7.5R            0.442±0.080**      0.181±0.097* 
7.8R            0.291±0.019**      0.364±0.052*** 
7.11R           0.290±0.069**      0.308±0.036*** 
17.1R           0.389±0.082*       0.272±0.059** 
17.5R           0.329±0.056**      0.230±0.018*** 



Table 2 demonstrates that the Nramp resistant allele confers 
enhanced nitrite release following priming/activation with LPS 
and interferon-γ (IFNγ). Nitrites are the stable end-product of 
nitric oxide generated by upregulated expression of the inducible 
nitric oxide synthase gene in resistant macrophages. Nitric 
oxide also mediates antimicrobial and tumouricidal activity, and 
is specifically known to be the final effector mechanism for 
antileishmanial and antimycobacterial activity in murine 
resistant macrophages. 

endproduct of NO production using the Griess reagent. Cells were 
incubated for 24 or 30 hours in the presence of LPS (25ng/ml) or 
LPS plus IFNγ (25 ng/ml and 25 U/ml). Determinations were 
normalised to cell number from the crystal violet staining 
intensity of a parallel plate and results are presented as the 
ratio of nitrite to crystal violet. Asterisks indicate 
significance levels (p<0.05 = *; p<0.01 = **; p<0.001 = ***) for 
results of Student's t-tests used to compare each resistant 
transfectant against the susceptible transfectants 10S and 30S. 
Clones 17.3R and 17.6R developed from an independent transfection 
also showed significantly (p<0.05) higher NO levels in this 
experiment. Results representative of 5 independent experiments 
performed. 



Transfectant    LPS alone          LPS + IFNγ 

10S             0.043±0.008        0.296±0.026 
30S             0.009±0.008        0.274±0.027 
7.2R            0.215±0.005***     0.603±0.059** 
7.5R            0.256±0.062***     0.857±0.059*** 



Table 3 demonstrates that the Nramp resistant allele confers 
enhanced L-arginine uptake following priming/activation with LPS 
and IFN-γ. L-arginine provides the substrate for generation of 
nitric oxide involved in signal transduction for upregulated 
expression of KC in resistant macrophages (Ref 24), and for the 
final effector mechanism for cidal activity of the macrophage. 

Table 3. L-arginine uptake is enhanced in resistant 
transfectants compared to susceptible following activation with 
contribution from serum. Pilot experiments demonstrated that the 
uptake of [3H]L-arginine (0.25 µCi, specific activity 58 
Ci/mmol) from 10^5 cells was linear over a one hour time period at 
37°C. In all subsequent experiments cells were pulsed for 30-45 
mins. The incubation was terminated by removing the media and 
washing the adherent cells 3 times in PBS containing 10 mM 
unlabelled L-arginine. Cells were lysed in 50 µl of 1% SDS and 
counted in 5 ml of Aquasol II (DuPont-NEN). Results are 
expressed as the percentage stimulation ± standard deviation 
observed in 6 hour LPS + IFN-γ treated macrophages compared to 
untreated controls. Asterisks indicate significance levels 
(p<0.05 = *; p<0.01 = **; p<0.001 = ***) for results of Student's 
t-tests used to compare each resistant transfectant against the 
susceptible transfectants 2S, 10S and 30S. Results 
representative of 5 independent experiments performed. 



Transfectant    Percent Enhancement 

2S              108±5 
10S             119±10 
30S             122±10 

7.5R            194±24*** 
7.8R            204±40** 
17.1R           168±11* 
17.3R           186±5** 



This demonstration that Nramp influences three independent 
pleiotropic effects of the gene previously associated with 
Ity/Lsh/Bcg function provides definitive evidence that Nramp is 
Ity/Lsh/Bcg. 



WO 95/20044 



PCT/GB95/00095 




these Nramp regulated pleiotropic effects rely on intracellular 
signalling mediated by the generation of mitochondrially-derived 
reactive oxygen intermediates (ROI). 

Table 4 demonstrates that respiratory burst and L-arginine uptake 
are inhibited in the presence of the mitochondrial electron 
transport inhibitors rotenone (0-40 µM; inhibits complex I → 
ubiquinone) or thenoyltrifluoroacetone (TTFA; 0-400 µM; inhibits 
complex II → ubiquinone). Concentrations of inhibitors were 
based on previous studies (Ref 61) examining the role of 
mitochondrially-derived ROI on apoptosis and the gene-inductive 
effects of TNFα in fibroblasts, and were not observed to have 
toxic effects on the RAW264.7-derived transfectant lines. 

Table 4. L-arginine uptake experiments were performed in the 
presence of the radical scavengers nordihydroguaiaretic acid 
(0-40 µM) and butylated hydroxyanisole (0-400 µM). Respiratory 
burst and L-arginine uptake experiments were also carried out in 
the presence of the mitochondrial electron transport inhibitors 
rotenone (0-40 µM; inhibits complex I → ubiquinone) or 
thenoyltrifluoroacetone (TTFA; 0-400 µM; inhibits complex II → 
ubiquinone). Cells were allowed to adhere to microtitre wells 
for 1 hour prior to a 1 hour pretreatment with drugs before 
addition of activation agents for appropriate time periods. 
Results are presented for rotenone inhibition (percent of 
control) of L-arginine and RB for the resistant transfectant 
clone 7.5R examined after treatment with LPS/IFNγ. 

Rotenone Concentration (µM)    L-arginine uptake    RB 

0                              100                  100 
5                              66±16                52±3       7 
10                             79±17                33±4       4 
20                             73±7                 16±5       4 
40                             74±11                11±3       7 



These findings imply a role for Nramp in regulating mitochondrial 
function and the generation of reactive oxygen intermediates for 
signalling. Thus there are two ways in which Nramp may influence 
intracellular signalling for macrophage activation: (i) by 
influencing the generation of reactive oxygen intermediates from 
the mitochondrion; and (ii) by enhanced generation of nitric 
oxide. These studies of Nramp gene function bring together the 
decade of functional work demonstrating that Nramp regulates 
macrophage priming/activation for antimicrobial activity, with 
the many pleiotropic effects of the gene due to its role in 
regulating cell signalling events. The crucial significance of 
the putative SH3 binding domain in the function of the Nramp gene 
is that it regulates its function in response to 
priming/activation signals. 

Nramp protein and antibody production. 

On the basis of hydropathy plots the applicants have selected two 
structural domains that are not hidden by membrane and therefore 
are likely to be accessible within the intact/native protein 
conformation. Oligonucleotide primers to these two domains 
(N-terminal amino acids 1-82, C-terminal amino acids 514-548) were 
generated with restriction sites allowing the amplified products 
to be cloned in the appropriate reading frame in the pGEX series 
enabling the induction of high level expression of fusion 
proteins that can be easily purified from bacterial lysates by 
affinity chromatography using glutathione agarose. Bound 
proteins can be released from the matrix under mild conditions 
such that the native conformation is maintained to improve 
antigenicity. This system has been employed to generate Nramp 
proteins of approximately 8.2 and 3.4 kD from the N-terminal and 
C-terminal regions respectively, which have been used as antigens 
with the RIBI adjuvant to inoculate rabbits for production of 
polyclonal antibodies and rats for production of monoclonal 
antibodies. In order to ensure that antibodies raised will be 
specific to both murine and human Nramp/NRAMP proteins, these 
antibodies should be screened or affinity purified against 
peptides prepared on the basis of sequence information across 
these N-terminal and C-terminal regions used for production of 
the fusion protein. Specifically, against peptides 

DKSPPRLSRPSYGSISS; PQPAPCRETYLSEKIPIP; and 
GTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQ within the N-terminal region, 
and WTCCIAHGATFLTHSSHKHFLYGL in the C-terminal region. 

Genomic Organization and Sequence 
of Human NRAMP gene (Ref 62) 

Genomic sequencing of NRAMP. A human yeast artificial chromosome 
(YAC) AM11/D3/14 (Ref 30), obtained from the ICRF library 
(available through the UK Human Gene Mapping Project HGMP 
Resource Centre, Hinxton Hall, Cambridge CB10 1RQ, UK) by 
screening with a VIL1 probe (Ref 31) and containing the entire 
human NRAMP sequence (Ref 32), was subcloned into λEMBL3 
(Stratagene Ltd, Cambridge, UK) and screened with the full-length 
murine Nramp cDNA λ8.1 (Ref 55). Two overlapping clones, λ3 and 
λ8.1, containing the full-length NRAMP sequence, were digested 
with PstI, subcloned into pBluescript II SK (Stratagene Ltd), and 
re-screened with the full-length murine cDNA probe (Ref 55). 
Exon positive clones were selected for sequence analysis, with 

Human cDNA sequence was obtained by reverse transcription (RT) 
and PCR amplification of RNA prepared from the human 
monocyte-derived THP1 cell line (Ref 33). Where appropriate, PCR 
products were cloned into the pCR vector (Invitrogen Corporation, 
Abingdon, UK) for sequence analysis from at least 2 independent 
clones. Clones corresponding to the 3' region were not 
originally isolated by screening with the murine cDNA. A 
fragment was generated by 3' rapid amplification of cDNA ends 
(RACE; Ref 34) from polydT adaptor primed THP1 cDNA. cDNA was 
amplified using the adaptor primer in combination with 2 nested 
primers selected from exon 13 (GTGCTGCCCATCCTCACG; 
GAGTTTGCCAATGGCCTG). A suitable genomic clone was prepared by 
amplification of a fragment from both λ3 and the YAC AM11/D3/14 
using exon 13 primers and a primer (GGACGAGAAGGGAACTAG) designed 
from the 3' end of the RACE product. The 5' end of the RNA was 
mapped by 5' RACE involving RNA ligase-dependent ligation of a 
blocked anchor primer to the 3' end of random hexamer primed 
reverse transcribed THP1 RNA. Amplification using an anchor 
primer and two NRAMP-specific nested antisense primers 
(AAGAAGGTGTCCACAATGGTG, CGGTTTTGTGTCTGGGAT) yielded a single 
NRAMP product. The product was TA cloned and 3 clones subjected 
to sequence analysis to determine the transcriptional initiation 
site and sequence of the most proximal exon that failed to 
hybridise to any mouse cDNA probe. This facilitated further 
analysis of the 5' flanking region, the sequence for which was 
obtained from a 1.6 kb PstI fragment that contained sequence 
homologous to the 5' RACE product. 
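
As a rough illustration of how the primers quoted above compare, the following sketch reports length, GC count and a rule-of-thumb melting temperature. The text does not say how the primers were designed or what annealing temperatures were used; the Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic approximation for short oligonucleotides, and the dictionary labels are hypothetical names introduced here.

```python
def gc_count(primer: str) -> int:
    """Number of G or C bases in the primer."""
    return sum(base in "GC" for base in primer.upper())


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature (deg C): 2*(A+T) + 4*(G+C)."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


# Primer sequences exactly as quoted in the text; the keys are
# illustrative labels only.
PRIMERS = {
    "exon 13 nested 1": "GTGCTGCCCATCCTCACG",
    "exon 13 nested 2": "GAGTTTGCCAATGGCCTG",
    "3' RACE product end": "GGACGAGAAGGGAACTAG",
    "5' RACE antisense 1": "AAGAAGGTGTCCACAATGGTG",
    "5' RACE antisense 2": "CGGTTTTGTGTCTGGGAT",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_count(seq)}/{len(seq)}, "
          f"Tm ~{wallace_tm(seq)} C")
```

The estimate ignores salt and primer concentration, so it is only useful for comparing primers against one another.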

Analysis of sequence data. Nucleotide and amino acid sequence 
comparisons were made using the BESTFIT programme on-line to the 
CRC Resource Centre, UK. Amino acid sequences for murine and 
human NRAMP were aligned with yeast SMF1 and SMF2 (Ref 35) using 
the multiple sequence alignment program Clustal V (Ref 36). 

exons 4 and 6 of human NRAMP, using RNA purified from peripheral 
blood mononuclear cells. This product spans the region of murine 
Nramp which carries the susceptibility mutation. PCR products 
were purified with a Qiagen PCR purification kit (Hybaid Ltd, 
Teddington, UK), and subjected to direct cycle sequence analysis 
using the Circumvent Thermal Cycle Dideoxy DNA Sequencing Kit 
(New England Biolabs, CP Laboratories, Bishop's Stortford, UK) 
with an internal sequencing primer (CATCTCTACTACCCCAAGGTGC). 
Direct cycle sequence analysis was performed on 19 individuals: 
8 visceral leishmaniasis patients, 9 unaffected individuals taken 
from the same families, and 2 nonendemic British controls. 
Endemic samples were from Brazil (4 affecteds; 5 unaffecteds) and 
the Sudan (4 affecteds; 4 unaffecteds). 

Primer design and PCR analysis of a 5' gt repeat using human 
genomic DNAs. PCR products of 780-794 bp were amplified from 
genomic DNA using primers located -365 bp 5' of the transcription 
start site (GAGGGGTCTTGGAACTCCA) and within intron 1 
(CACCTTCTCCGGCAGCCC). This product was reamplified to generate 
108-122 bp products using the 5' primer and an end-labelled 
([γ-32P]dATP; ICN Biomedicals Ltd, Thame, UK) internal reverse 
primer TACCCCATGACCACACCC. The products were resolved by 
denaturing polyacrylamide gel electrophoresis and sized using a 
sequencing ladder. PCR products corresponding to different 
allelic forms were directly sequenced as described above. 

Family linkage studies. A set of 36 multicase families of 
leprosy, tuberculosis and visceral leishmaniasis from our study 
site in Brazil (Ref 37) were used to determine linkage between 
a polymorphic gt repeat in the 5' promoter region of human NRAMP 
and previously mapped 2q34-q35 markers (Refs 32, 37). Two-point 
linkage analyses were carried out between NRAMP and the markers 
(TNP1, IL8RB, VIL1, DES) using LINKAGE (Ref 38) on-line to the 
CRC Resource Centre. Gene frequencies for the NRAMP alleles were 

Results 

Referring to Figure 9, Clustal V multiple sequence alignment is 
shown for the deduced amino acid sequence for human NRAMP, murine 
Nramp clone λ8.1 (Ref 55), and the yeast mitochondrial proteins 
SMF1 and SMF2 (Ref 35). Residues showing 3/4 or 4/4 identities 
across the 4 proteins are shown in bold. For the NRAMP sequence: 
exon boundaries are indicated above the sequence; PKC indicates 
consensus sites (S/T-X-R/K) for protein kinase C phosphorylation; 
=== indicates consensus sites for N-linked glycosylation; and 
putative membrane spanning domains (Ref 2) are overlined and 
numbered on the sequence. * indicates cysteine residues 
conserved across all 4 proteins; . indicates conserved 
substitutions. 

Referring to Figure 10(A), results of amino acid database 
searches for exon 2 are shown identifying a number of sequence 
matches with the Pro/Ser rich putative SH3 binding domain of 
NRAMP; + represents a conserved amino acid. Residues showing 4 
or more identities are in bold. Multiple sequence alignments 
allowed for the generation of a consensus motif over this region 
as shown by double underlining. Also shown is the PKC site on 
S40, and tyrosine residues (*) on either side of the consensus 
motif. In Figure 10(B), Clustal V multiple sequence alignment 
is shown for human NRAMP, mouse Nramp, SMF1 and SMF2, and the 
expressed sequence tags (Ref 60) of Oryza sativa (rice; accession 
number D15268) and Arabidopsis thaliana (accession number Z30530) 
genes, reading frames 1 and 2 respectively. Residues showing 
>4/6 identities across the 6 proteins are in bold. Membrane 
spanning domains 6 and 7 for NRAMP are overlined and numbered on 
the sequence. The 20 amino acid conserved transport motif (Ref 
2) is indicated by double overlining. All 6 proteins show 
identities (similarities) of 7/20 (11/20) across the transport 
motif. * indicates cysteine residues conserved across all 6 

promoter region human NRAMP sequence 5' of the transcription 
start site. The transcription start site is located 148 bp 5' 
of the ATG initiation codon, as indicated. Putative promoter 
region elements identified by inspection (indicated above the 
sequence) include: a possible Z-DNA forming dinucleotide repeat 
t(gt)5ac(gt)5ac(gt)9g; 6 interferon-γ response elements; 3 
W-elements; 1 AP1 site; 3 NFκB binding sites; and 9 purine-rich 
GGAA core motifs (2 on the antisense strand) for the 
myeloid-specific PU.1 transcription factor, two of which combine 
with imperfect AP1-like sites to create PEA-3 consensus motifs. 
Strings of heat shock transcription factor (HSTF) motifs (NGAAN 
or NTTCN) also occur across the 440 bp sequence (not marked). 

Referring to Figure 12, two families are shown segregating for 
(a) alleles 2 and 3, or (b) alleles 1, 2 and 3 of the 5' 
dinucleotide repeat polymorphism. Photographs below the families 
show autoradiographs of polymorphic PCR products (122 bp, 120 bp, 
and 118 bp for alleles 1 to 3, respectively) separated by 
denaturing polyacrylamide gel electrophoresis. Lanes from left 
to right on each photograph show individuals (a) I-2, II-1, II-2, 
II-3, II-4, II-5, II-6, III-1, III-2, III-3; and (b) I-1, I-2, 
II-1, II-2, III-1, III-2, III-3, III-4, III-5, III-6 as indicated 
on the pedigrees. Individual I-1 is not shown for family (a). 

Sequence and genomic organization of human NRAMP. The sequencing 
of exon positive clones isolated by hybridization with a full- 
length cDNA allowed for the identification of the complete 
sequence (deposited with EMBL under accession numbers X82015 and 
X82016) of the human 2q homologue (NRAMP) of the murine 
chromosome 1 derived Nramp gene. Analysis of exon sequence from 
a region 440 bp 5' of the transcriptional initiation site to the 
termination codon allowed for the complete exon-intron 
organization to be elucidated (Tables 5 and 6). Human NRAMP is 
encoded by 15 exons and, in contrast to the 548 amino acid 
Kozak (Ref 16) consensus. The next, more distal codon found at 
M68 has a 2/6 Kozak consensus. However, we propose that like the 
murine macrophage form (Ref 55), the more proximal initiation 
codon will be utilised. This is reinforced by the striking 
(100%) sequence conservation for residues 51-67 (Fig. 9), 
indicating a requirement for the maintenance of sequence for 
function. The discrepancy in size between murine (548) and human 
(550) genes results from the inclusion of 3 additional residues 
within exon 2 causing a PTS duplication, with the non-duplicated 
form representing a rare variant in Brazilian (Ref 32) and 
British (unpublished data) pedigrees. In addition, the human 
gene exhibits a single amino acid deletion relative to the mouse 
within the poorly conserved last exon. Overall amino acid 
identity with murine Nramp was 86% (92% with conserved 
substitutions). Exons exhibiting highest sequence identity 
(100%) include exons 4, 6 and 7, with exon 11 displaying 98% 
identity. These exons encode TM1, the first extracellular 
domain, TM2 and TM3, and the conserved transport motif. It is 
of interest that TM2, containing the murine 
susceptibility-associated mutation (Refs 2, 56) is well conserved, 
suggesting that this domain plays an important functional role 
which cannot tolerate amino acid substitutions. NRAMP was aligned 
with murine Nramp and with the two yeast mitochondrial membrane 
proteins, SMF1 and SMF2, using the multiple sequence alignment 
program Clustal V (Fig. 9). SMF1 and SMF2, which show 49% identity (70% 
similarity) with each other, show 30% (57%) and 29% (53%) 
identities (similarities) with human NRAMP, respectively. This 
parallels the 30% (58%) and 30% (53%) identities (similarities) 
we reported (Ref 57) for murine Nramp. Regions of most striking 
sequence identity between all 4 proteins were found predominantly 
within the hydrophobic regions, although high identities were 
also found in exons 3, 4, 5 and 6, and for the conserved 
transport motif from exon 11. Within exon 6, the YAC-derived 
amino-acid human sequence exhibited a Gly at residue 172, 




do not introduce negatively charged residues found in the 
susceptible allele of mice. As before (Refs 55, 32), matches 
with other proteins (Fig. 10) in the sequence databases were 
observed over exon 2 which contains a putative SH3 binding 
domain; and over the region of exon 11 containing the conserved 
binding-protein-dependent transport motif (Ref 2). The latter 
was highly conserved (7/20 identity; 11/20 similarity) in 
murine/human NRAMP, the yeast proteins, and in two expressed 
sequence tags from Oryza sativa (rice) and Arabidopsis thaliana. 
SMF1 and SMF2 do not demonstrate high identity over the 
proline/serine rich sequence of exon 2, but do have consensus 
(S/T-X-R/K) sequences (one in SMF1; two in SMF2) for 
PKC-dependent phosphorylation. Human NRAMP has two PKC consensus 
sites (in exons 2 and 3, Fig. 9) in this region, compared to 
three in the murine gene. The location of the distal site in 
SMF2 matches precisely with human NRAMP site 2/murine Nramp site 
3, whereas the site in SMF1 is located 8 residues upstream. A 
pair of cysteine residues are conserved in all four genes: (i) 
in the first extracellular loop domain; and (ii) in the third 
extracellular domain which also contains two sites for N-linked 
glycosylation in the human and murine genes. Charged residues 
are conserved across all 4 proteins within the transmembrane 
spanning domains 1, 2, 3, 4, and 7 (Fig. 9), except for a Lys→Ser 
substitution in the first transmembrane domain of SMF1. 
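
The identity and similarity fractions quoted above (for example 7/20 and 11/20) are counts over aligned positions. A minimal sketch of such a count follows; the two aligned strings are hypothetical, and the conservative substitution groups are an illustrative assumption rather than the scoring scheme used by BESTFIT or Clustal V.

```python
# Conservative substitution groups: an illustrative assumption only.
GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]


def identity_similarity(a: str, b: str) -> tuple[int, int]:
    """Return (identities, similarities) for two aligned sequences.

    Gap positions ('-') are skipped. Similarities include identities,
    matching the way the text reports 7/20 identity and 11/20 similarity.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    ident = sim = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        if x == y:
            ident += 1
            sim += 1
        elif any(x in g and y in g for g in GROUPS):
            sim += 1
    return ident, sim


# Hypothetical aligned fragments, not taken from NRAMP or SMF1/2.
ident, sim = identity_similarity("MKLSTD", "MRLTAE")
print(f"{ident}/6 identity; {sim}/6 similarity")
```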

Analysis of the murine mutation site in visceral leishmaniasis 
patients and controls. To determine whether a mutation 
homologous to the murine disease susceptibility Gly→Asp mutation 
occurs in man, RT/PCR and direct cycle sequencing was performed 
on RNA from visceral leishmaniasis patients and controls from 
Brazil and the Sudan. All 19 human samples, whether from 
affected or unaffected individuals, encoded a Gly at this 
position. 

site (Fig. 11). The transcription start site is located 148 bp 
5' of the ATG initiation codon. A series of predicted promoter 
region elements also occur 5' of the transcription start site, 
including a possible Z-DNA forming (Refs 39, 40) dinucleotide 
repeat t(gt)5ac(gt)5ac(gt)9g located -317 to -274 bp 5' of the 
transcription start site. On either side of the Z-DNA forming 
dinucleotide repeat are a series of matches to inducible promoter 
element consensus sequences. These include: 6 interferon-γ 
response elements, 1 x 3'→5' showing 8/8 matches to the consensus 
sequence CT(G/T)(G/T)ANN(C/T) (Refs 41, 42), 3 x 5'→3' showing 7/8 
matches, 2 x 3'→5' showing 7/8 matches; 3 W-elements (also known 
as H-, E-, W-, S-, or Z-boxes), 1 x 3'→5' showing 8/8 matches to 
the consensus sequence (A/T)GNA(C/*)C(C/T)(C/T) (Ref 41), 2 x 
5'→3' with 7/8 matches; an AP1 site showing 6/7 matches to the 
consensus sequence TGACTCA (Ref 43); and 3 NFκB binding sites, 2 x 
5'→3' and 1 x 3'→5', each showing 7/10 matches to the consensus 
sequence GGG(G/A)(C/A/T)T(C/T)(C/T)CC (Ref 44). Nine purine-rich 
GGAA core motifs (2 on the antisense strand) for the 
myeloid-specific PU.1 transcription factor (Refs 45, 46) also 
occur across this region, two of which combine with imperfect 
AP1-like sites to create PEA3 motifs (Ref 47), and another two are 
juxtaposed. Strings of heat shock transcription factor (HSTF) 
motifs (NGAAN or NTTCN; Ref 48) were also present, although their 
order and phase are not consistent with currently functional 
elements. TATA, GC and CCAAT boxes were not found within the 440 
bp 5' flanking sequence. 
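
The "n/m matches" scores above come from comparing a window of sequence with a degenerate consensus on either strand. The sketch below shows one way to do such a scan. The elements in the text were identified by inspection, so this is not the original procedure; the test sequence is a hypothetical 12-mer, and the consensus CT(G/T)(G/T)ANN(C/T) is written in IUPAC codes as CTKKANNY.

```python
# IUPAC degenerate nucleotide codes used by the consensus strings.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "H": "ACT", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def score(window: str, consensus: str) -> int:
    """Number of positions at which the window fits the consensus."""
    return sum(base in IUPAC[code] for base, code in zip(window, consensus))


def scan(seq: str, consensus: str, min_matches: int):
    """Return (start, strand, matches) for windows meeting the threshold.

    `start` is the 0-based position of the window on the forward strand;
    strand '+' is 5'->3' as given, '-' is the reverse complement.
    """
    seq = seq.upper()
    n = len(consensus)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(len(s) - n + 1):
            m = score(s[i:i + n], consensus)
            if m >= min_matches:
                start = i if strand == "+" else len(seq) - n - i
                hits.append((start, strand, m))
    return hits


# Hypothetical sequence containing one perfect match on the '+' strand.
print(scan("AACTGTACGCAA", "CTKKANNY", min_matches=8))
```

Lowering `min_matches` to 7 would also report the 7/8 class of sites described in the text.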

Mapping of a polymorphic repeat in the 5' promoter region. The 
presence of a gt repeat in the 5' region of the YAC-derived NRAMP 
sequence stimulated further analysis of this region to determine 
whether a polymorphism was present in human population samples. 
Four alleles were observed in Brazilian families (Fig. 12): 
allele 1 = 122 bp; allele 2 = 120 bp; allele 3 = 118 bp; and 
allele 4 = 108 bp. Direct sequence analysis confirmed that the 
allele 4 = t(gt)5ac(gt)5ac(gt)4g. Gene frequencies determined on 
72 genetically independent Brazilians were 0.021 (allele 1), 
0.326 (allele 2), 0.646 (allele 3), and 0.007 (allele 4), 
providing an overall heterozygosity score of 0.476. Linkage 
analysis generated positive (>3) LOD scores (Table 7) for linkage 
between NRAMP and the four closest markers TNP1 (proximal) and 
IL8RB, VIL1, and DES (distal), consistent with physical mapping 
data (Ref 32) placing NRAMP 130 kb proximal to IL8RB, and 
confirming that this particular polymorphism occurs in the 2q35 
copy of NRAMP rather than in a related sequence (Ref 49) mapping 
to a region in mice homologous to 6q27 in man. 
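
The heterozygosity score quoted above agrees with the usual expected-heterozygosity formula, one minus the sum of the squared allele frequencies. The text does not state which definition was used, so the sketch below is an assumption that happens to reproduce the reported value; the function name is illustrative.

```python
# Allele frequencies as reported for the 72 genetically independent Brazilians.
allele_freqs = {1: 0.021, 2: 0.326, 3: 0.646, 4: 0.007}


def expected_heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for an iterable of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


h = expected_heterozygosity(allele_freqs.values())
print(f"{h:.3f}")  # 0.476
```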

Discussion 

Genomic sequence analysis presented here demonstrates that the
human NRAMP gene located on chromosome 2q35 has a genomic size
of 12 kb and contains 15 exons. The amino acid sequence deduced
from nucleotide sequencing of the 15 exons shows that, like
murine Nramp, NRAMP encodes a polytopic integral membrane protein
containing both a conserved transport motif (Ref 2) and a
putative SH3 binding domain (Ref 55). Over the 20 amino acid
transport motif, strong sequence identity (7/20 residues; 11/20
with conserved substitutions) was observed between NRAMP (Nramp),
the two yeast proteins SMF1/2, and the expressed sequence tags
from rice and Arabidopsis, suggesting that this is a functionally
important motif among phylogenetically distinct organisms.
Interestingly, these identities are higher than those reported
(4/20 identity; 6/20 similarity) between murine Nramp and the
nitrate transporter of Aspergillus nidulans, which led Vidal and
coworkers (Ref 2) to hypothesise that Nramp might function in
direct delivery of nitrates into the phagolysosomes of infected
macrophages. The stronger identity observed here between the
transport motif of NRAMP and the yeast mitochondrial proteins
SMF1/2, together with the striking overall similarity between the
yeast and human/murine genes, suggests that NRAMP may be a
[...] possibly at the level of translocation (Ref 35). Complementation
experiments with yeast mutants might therefore reveal more about
the molecular mechanism of Nramp function. Sequence similarity
between NRAMP (Nramp) and SMF1/2 was poor over the proline/serine
rich putative SH3 binding domain. This is perhaps not unexpected
as these are modular structures that occur in a variety of
otherwise unrelated proteins involved in signalling and/or
cytoskeletal attachment (Ref 55). Hence, this modular motif may
be a recent addition to the NRAMP molecule related to its
macrophage-restricted function, and we might expect that other
more ubiquitously expressed NRAMP-like molecules will occur. A
second Nramp-related sequence has already been mapped in the
mouse (Ref 49), and others may be found.

Our major interest in analysing the human NRAMP gene was to
provide a basis for screening multicase families for
mycobacterial (tuberculosis and leprosy) and leishmanial
infections. As a first step, we examined a small group of
visceral leishmaniasis patients and their unaffected sibs to see
whether a mutation similar to the murine
susceptibility-associated mutation (Refs 2, 56) could be found.
As might have been predicted, exon 6 encoding the second membrane
spanning domain is highly conserved between murine and human
sequences, as well as with the yeast genes, suggesting that this
is a functionally important domain. No mutations were found within
this region in the 19 human samples examined by direct cycle
sequencing. Similarly, a polymorphic variant identified by us
(Ref 32) in the putative SH3 binding domain occurred at very low
frequency, suggesting that this too might be a region of the
macrophage-expressed NRAMP molecule which, although recently
acquired in evolutionary terms, may be critical to its function
and intolerant of non-conservative substitutions.

The 440 bp of promoter region sequence identified here is of
[...] rather than cause structural changes to the molecule.
Identification of PU.1 and PEA3/AP1-like response elements is
consistent with haematopoietic-restricted gene expression (Refs
47, 50, 51). Although earlier studies (Refs 2, 55) suggest that
murine Nramp is constitutively expressed in macrophages, the
inducible promoter region elements identified in the human
sequence suggest that expression may be regulated by macrophage
priming/activation stimuli. In particular, interferon-γ and
W-elements are common to other genes (e.g. MHC class II (Ref 41);
FcγRI (Ref 42); iNOS (Ref 52)) inducible in macrophages. AP1 and
NFκB sites also occur in the promoter regions of other
macrophage-expressed proteins (e.g. tissue factor (Ref 43); iNOS
(Ref 52)) and are required for LPS and TNF inducibility, AP1
acting to stabilise and maintain NFκB activity (Ref 43). Given
the many functional observations (reviewed in Refs 1, 57-59)
demonstrating that the Ity/Lsh/Bcg (candidate Nramp) phenotype
is so closely allied to the interferon-γ/LPS macrophage
activation pathway, it will be important to determine the
functional relevance of these elements to tissue-specific
expression of NRAMP in different macrophage subpopulations. This
may be particularly relevant to previous observations
demonstrating that the Lsh gene phenotype is differentially
expressed in different macrophage subpopulations (Refs 53, 54),
and that interaction with extracellular matrix elicits different
levels of TNFα in bone marrow-derived macrophages from congenic
resistant and susceptible mice (Ref 9). Although their order and
phase were not consistent with currently functional elements, it
was of interest that strings of HSTF elements were also found in
the promoter region of human NRAMP. These may represent
ancestral elements related to the mitochondrial
activity/expression of the yeast SMF1 and SMF2 genes.

Another interesting feature of the 5' flanking region of human
NRAMP was the presence of a putative Z-DNA forming dinucleotide
[...] regulatory signalling have been attributed to this form of DNA
(reviewed Ref 39). It was particularly intriguing that a
polymorphism in this repeat unit was observed in human genomic
DNA samples. The fact that the putative Z-DNA forming repeat is
flanked on either side by other promoter region response elements
suggests that this polymorphism may be functionally important in
determining gene expression, if not on the basis of its own role
as a transcriptional regulator, at least because it will
influence the juxtaposition of other response elements. The
level of heterozygosity (0.476) in the Brazilian population
studied here made this a useful marker for genetic linkage
analysis between NRAMP and other 2q markers. However, the number
of alleles was small compared to other repeat (e.g.
microsatellite) polymorphisms, suggesting that the generation of
further polymorphic variants across this repeat may not be
tolerated in evolutionary terms. This polymorphism may therefore
be of functional relevance in further analysis of the association
between NRAMP and disease.
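
The repeat discussed above follows the pattern t(gt)5ac(gt)5ac(gt)ng, where only the final gt block varies. As a purely illustrative sketch (the function name is hypothetical), the repeat can be written out for a given n and its length checked: with n = 9, the value given for the YAC-derived sequence, the repeat is 44 bases long, which matches the 44 positions from -317 to -274.

```python
def poly_gt_repeat(n):
    """Spell out t(gt)5ac(gt)5ac(gt)n g for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be zero or a positive integer")
    return "t" + "gt" * 5 + "ac" + "gt" * 5 + "ac" + "gt" * n + "g"


yac_repeat = poly_gt_repeat(9)
print(len(yac_repeat))   # 44
print(317 - 274 + 1)     # 44 positions between -317 and -274 inclusive

# Each gt unit changes the length by 2 bp, so n = 9 and n = 4 differ by 10 bp.
print(len(poly_gt_repeat(9)) - len(poly_gt_repeat(4)))  # 10
```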

TABLES 5 and 6

Intron (4 flanking nucleotides)/exon (amino acids) boundaries and sizes (bp)
for the 15 exons of human NRAMP identified by genomic sequence analysis of
YAC-derived clones. Amino acid sequence identity with murine Nramp is shown
for each exon.

TABLE 5

Exon     Size   Intron/exon boundaries                       %AA Identity
Number   (bp)                                                (Mouse)

EXON 1   155                               Met Thr G         50
                              ATG..145bp..ATG ACA G gtga

EXON 2   143         ly Asp   Lys..(43aa)..Lys Pro           68
                acag GT GAC   AAG          AAA CCG gtgg

EXON 3               Gly      Thr..(37aa)..Phe Lys           95
                     GGC      ACC          TTC AAA gtaa

EXON 4   120         Leu      Leu..(36aa)..Pro Lys           100
                acag CTT      CTC          CCT AAG gtgg

EXON 5   107         Val      Pro..(31aa)..Ala Gly Ar        91
                tcag GTG      CCC          GCT GGA CG gtac

EXON 6   71          g Ile    Pro..(19aa)..Asn Tyr G         100
                tcag A ATC    CCA          AAC TAC G gtgg

EXON 7   68          ly Leu   Arg..(18aa)..Tyr Glu           100
                gtag GG CTG   CGG          TAT GAG gtag

EXON 8   156         Tyr      Val..(48aa)..Val Lys           88
                gcag TAT      GTG          GTC AAG gtag

EXON 9   159         Ser      Arg..(49aa)..Ala Ala           87
                gtag TCT      CGA          GCT GCG gtga

EXON 10  90          Phe      Asn..(26aa)..Gln Gly           80
                gcag TTC      AAC          CAG GGG gtga

EXON 11  120         Gly      Val..(36aa)..Met Glu           98
                gcag GGC      GTG          ATG GAG gtag

EXON 12  150         Gly      Phe..(46aa)..Leu Leu           94
                ccag GGC      TTC          CTG CTG gtag

EXON 13              Leu      Pro..(20aa)..Asn Gly Le        84
                ccag CTC      CCG          AAT GGC CT gtga

EXON 14  154         u Leu    Asn..(47aa)..Tyr Leu           73
                ccag G CTG    AAC          TAC CTG gtac

EXON 15  108         Val      Trp..(34aa)..Ter               57
                ccag GTC      TGG          TAG

TABLE 6

OVERLINED SEQUENCES REPRESENT REGIONS SELECTED FOR PRIMERS TO
AMPLIFY INDIVIDUAL EXONS.



EXON 1 



ATGTAAGAGGCAGGGCACTCGGCTGCGGATGGGTAACAGGGCGTGGGCTGGCACACTTACTT 
GCACCAGTGCCCAGAGAGGGGGTGCAGGCTGAGGAGCTGCCCAGAGCACCGCTCACACTCCC 

M T 

AGAGTACCTGAAGTCGGCATTTCAATGACAGgtgagtagtggcccctagggacagagcctga 
ttggggggtggagtggaggagatcactaggctggtggagacttgaggaagcaagaaagccct 
tggtcccctgtg 



EXON 2 AMPLIFIED REGION 

GDKGPQRLSGSS 

tcaccatgcttcatgggcccccacagGTGACAAGGGTCCCCAAAGGCTAAGCGGGTCCAGCT

YGSISSPTSPTSPGPQQAPPR 

ATGGTTCCATCTCCAGCCCGACCAGCCCGACCAGCCCAGGGCCACAGCAAGCACCTCCCAGA 

ETYLSEKIPIPDTKP 

GAGACCTACCTGAGTGAGAAGATCCCCATCCCAGACACAAAACCGgtgggacnctggaaact



ttctgggggct



EXON 3 AMPLIFIED REGION 



aaggccagctgccaccatccctatacccacacccctcactctactcctcccacccccaacag 

GTFSLRKLWAFTGPGFLMS IA 
GGCACCTTCAGCCTGCGGAAGCTATGGGCCTTCACGGGGCCTGGCTTCCTCATGAGCATTGC 

FLDPGNIESDLQLGPVAGFK 
TTTCCTGGACCCAGGAAACATCGAGTCAGATCTTCAGCTNGGNCCNGTGGCGGGATTCAAAg 



taactaagtcgggacctgagtgggacactt 



EXON 4 AMPLIFIED REGION 

LLWVLLWA 
cctctctggctgaaggcctctccctgcctcctcacagCTTCTCTGGGTGCTGCTCTGGGCCA 
TVLGLLCQRLAARLGVVTGKD 
CCGTGTTGGGCTTGCTCTGCCAGCGACTGGCTGCACGTCTGGGCGTGGTGACAGGCAAGGAC 

LGEVCHLYYPK 
TTGGGCGAGGTCTGCCATCTCTACTACCCTAAGgtgagcttggggggcctggacagggagaa 



ccactggccccaaaccccaaacagccattttcagcttccacga 



V L W i_i I - c A - V v.- . • 

CGTCCTCTGGCTGACCATCGAGCTAGCCATTGTGGGCTCCGACATGCAGGAAGTCATCGGCA 
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TAIAFNLLSAGR 

CGGCCATTGCATTCAATCTGCTCTCAGCTGGACGgtaccaccccagtgtccccaactcttca 



ggcaggcagagaacagctgctgctacttccccccctaaccagtccctcccagagtctatttt 
atcctgctgtcccctctgaagcagctgctgccctgttttccagaaatgtaaagtgacttgtc 
taaagtcacacagatgtgagtcatgcaggaccccgggactgcag 



EXON 6+7 AMPLIFIED REGION 

I P L W G 

agacccctggtcctggctgggctgacccgggccactctggtttcagAATCCCACTCTGGGGT 

GVLITIVDTFFFLFLDNY 
GGCGTCCTCATCACCATCGTGGACACCTTCTTCTTCCTCTTCCTCGATAACTACGgtgggtg 
cacaccccacctcataggggagtggtggtggtgagggtgctgtactngggagaagggctctg 
acatcgaacagcctgggagcgcacctgagctccctcactctcccctgggtgcctctagcgag 

ttacttggacggctctcttcacctgtacatgggaaataatagcacagacttcagagggt 

32 bp- - 

atagccatacgatgtgatgtcacagattttttcgtggnttggtttaggtttggtttggttct 

GLRKLEAFFGLLITIMAL 
gctagtagGGCTGCGGAAGCTGGAAGCTTTTTTTGGACTCCTTATAACCATTATGGCCTTGA 

T F G Y E 

CCTTTGGCTATGAGgtaggaagccagtgctgcaacc



EXON 8 



ggaagccagtgctgcaaccccactgtggacctcccaagatcattcctctcccttccctcctc
tggccgcgggnnngggggggctggggtgggatggaggctgagaaatggtgaccgcggcg 

Y V V A R 

tggttgcgnggggcggggcttgtcctgaccaggctcctccctgcagTATGTGGTCGCCGTTC 

PEQGALLRGLFLPSCPGCGHP 

CTGAGCAGGGAGCGCTTCTTCGGGGCCTGTTCCTGCCCTCGTGCCCGGGCTGCGGCCACCCC 

ELLQAVGIVGAIIMPHNIYLH 
GAGCTGCTGCAGGCGGTGGGACTTGTTGGCGCCATCATCATGCCCCACAACATCTACCTGCA 
S A L V K 

CTCGGCCCTGGTCAAGgtgagcagaggggaggggaaagagaccccctcactcagtcggagcc 
atgctggctccgcctccaanntggagcccct 



EXON 9 

ctgcagtgagccatgcattgcaccacggcactccagtctgggtgacagaacaaaacctgtct 
ctaaaaaataaaataagtaagctggacacgtctgaggatggaacaaggtgagtgaaggagcg 
tgtcaggacc tgaggtagccagggacctcaaaggccagccttgcttcacccacacagtgctt 
acagtggtaaggcc tctgtggcaagaacagagatgtagaaaccatcggctgacctgaacctg 
cccagactgccacgcagggcacttaagaaggtactgggctttggggagaacatagaagtgtg 
agggtgggggacactgtggtgge tctgagggactttggcacttccctc tc 

A(JATGTAL'TTCUT(iATTGAGGCCACCATCGCCCTGTGCGTCTCCTTTATCATCAACCT'" mr r" 
VMAAFGQAFYQKTKQAA 
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GTCATGGGTGCATTTGQGCAGGCCTTCTACCAGAAAACCAAGCAGGCTGCGgtgagacacac 
tttcccccgcacctgaggccacacacgtactcatgtcctgtaagccttgccgaggaccctag 
gcaatgcagctgagcccttctgagtctctgccctgatgatcttccctgttggcagatatcat 
tea ttcagcaaataa teat tgagcatttgttatataccaagcacatcctagaccctggggat 
acagcagtcaatgctacaaagacccagctctctgcag 

EXON 10 

tttggaaccctggtcagtgctaggcagtccagtttcccaaggctgaggntgctcctcac 

FNICANSSL 
tcacatcttccttctactgccctggtacccacagTTCAACATCTGTGCCAACAGCAGCCTCC 
HDYAKIFPMNNATVAVDIYQG 
ACGACTACGCTAAGATCTTCCCCATGAACAACGCCACCGTGGCCGTGGACATTTACCAGGGG 

gtgagngngggtgggtggggagggcgtgacccagagaggcgcctcgggcagggccaccggtg 
gtaccacactcgtccctgcag 

EXON 11 

ccgtggcactttaccggggggtgagcgcgggtgggtggggagggcgtgacccagagaggctc 

G V I L G 

cccgcctcgggcagggccaccggtcctaccacactcgtccctgcagGGCGTGATCCTGGGCT 
CL F GPAALYIWAIGL LAAGQS 
GCCTGTTCGGCCCCGCGGCCCTCTACATCTGGGCCATAGGTCTCCTGGCGGCTGGGCAGAGC 

STMTGTYAGQFVME 
TCCACCATGACGGGCACCTACGCGGGACAGTTCGTGATGGAGgtagggcagggggcgggcca 

ggag 



EXON 12 

aaatgtt tagtc ttcagnaaccagctatgggatgggagttccccatttctccccacccatcc 
cctcttgccacctagggacagagctgtcccagttcaacagtggaaaaacagagcatgccccc 
agggataaatcggttgagggacatcagaggatctctcctctggaatccccagtcctgtctac 

GFLRLRWSSFA 
tcctcaccaaggagctcacccccaccccagGGCTTCCTGAGGCTGCGGTGGTCAAGCTTCGC 

RVL LT R S CA I LPTVLVAVFR 
CCGTGTCCTCCTCACCCGCTCCTGCGCCATCCTGCCCACCGTGCTCGTGGCTGTCTTCCGGG 
DLRDLSGLNDLLNVLQSLL 
ACCTGAGGGACTTGTCGGGCCTCAATGATCTGCTCAACGTGCTGCAGAGCCTGCTGgtga 



EXON 13 

LPVAVLPILTFTSMPTLMQ 
ccagCTCCCGGTTGCCGTGCTGCCCATCCTCACGTTCACCAGCATGCCCACCCTCATGCAGG 

E F A N G L 

AGTTTGCCAATGGCCTgtgagtaccccctttcccaagtgctggattgcatc 



TCCATCATGGTGCTAGTCTGCACCATCAACCTCTACTTCGTGGTCAGCTATCTGCCCAGCCT 
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PHPAYFGLAALLAAAYLGLS 
GCCCCACCCTGCCTACTTCGGCCTTGCAGCCTTGCTGGCCGCAGCCTACCTGGGCCTCAGCA 

T Y L 

CCTACCTGgtacagtagggccaggggatgccttgggaatggatga 



EXON 15 AMPLIFIED REGION 



tgccttgggaatggatgattccccagaggtcttggcatctccccaattcatggttgcccctc 
VWTCCLAHGATFLAHSSHH 
ccccagGTCTGGACCTGTTGCCTTGCCCACGGAGCCACCTTTCTGGCCCACAGCTCCCACCA 

HFLYGLLEEDHKGETSG* 
CCACTTCCTGTATGGGCCTCCTTGAAGAGGACCACAAAGGGGAGACCTCTGGCTAGGCCCAC 



ACCAGGGCTGGCTGGGGAGTGGCATGTATGACGT 

TABLE 7

Peak LOD scores for pairwise linkage analysis between NRAMP and
previously mapped (Ref 37) 2q34 (TNP1) and 2q35 (IL8RB, VIL1,
DES) markers calculated for 36 Brazilian families. RF =
recombination fraction (M=F) at which the peak LOD score was
obtained. N = number of families contributing to the analysis.

Marker intervals    N     Peak LOD Score    RF

TNP1-NRAMP          14    10.49             0.026
TNP1-IL8RB          9     6.02              0.032
TNP1-VIL1           15    9.84              0.001
TNP1-DES            19    11.45             0.046
NRAMP-IL8RB         11    3.56              0.072
NRAMP-VIL1          15    10.94             0.001
NRAMP-DES           20    8.94              0.051
IL8RB-VIL1          10    5.80              0.065
IL8RB-DES           12    10.03             0.035
VIL1-DES            14    9.47              0.059
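
For orientation, a LOD score compares the likelihood of the data at a recombination fraction θ with the likelihood under free recombination (θ = 0.5), on a log10 scale. The sketch below covers only the simplest textbook case of phase-known meioses, with k recombinants among n informative meioses. This is an assumption made for illustration and not the family-based computation that produced Table 7; the counts used are hypothetical and the function names are illustrative.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD(theta) = log10[ theta^k (1 - theta)^(n - k) / 0.5^n ], phase known."""
    k, n = recombinants, meioses
    if not 0 <= k <= n or n == 0:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if k > 0 and theta == 0.0:
        return float("-inf")  # any recombinant rules out theta = 0
    log_likelihood = (n - k) * math.log10(1.0 - theta)
    if k > 0:
        log_likelihood += k * math.log10(theta)
    return log_likelihood - n * math.log10(0.5)


def peak_lod(recombinants, meioses):
    """Return (theta_hat, LOD) at the maximum-likelihood estimate k/n, capped at 0.5."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


# Hypothetical counts: 1 recombinant in 20 phase-known meioses.
theta_hat, lod = peak_lod(1, 20)
print(f"theta = {theta_hat:.3f}, LOD = {lod:.2f}")
```

A peak LOD above 3, the threshold referred to in the text, corresponds to odds of more than 1000:1 in favour of linkage at the estimated recombination fraction.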

CLAIMS:

1. A natural resistance-associated macrophage protein having
an N-terminal region comprising an SH3 binding domain.

2. A protein according to claim 1, wherein the N-terminal
region further comprises one or more protein kinase C sites.

3. A protein according to claim 2, wherein the N-terminal
region has two protein kinase C sites which flank the SH3 binding
domain.

4. A protein according to any one of claims 1 to 3, wherein the
SH3 binding domain comprises the SH3 binding motif PGPAPQPXPXR.

5. A protein according to claim 4, wherein the SH3 binding
motif is PGPAPQPAPCR.

6. A protein according to claim 4 or claim 5, wherein the SH3
binding domain further comprises the polypeptide segment
(S,A)PP(R,K)XSRPXXXS(I,V)XSX at the N-terminal end of the SH3
binding motif.

7. A protein according to claim 6, wherein the polypeptide
segment is SPPRLSRPSYGSISSL.

8. A natural resistance-associated macrophage protein 
comprising the mouse amino acid sequence shown in Figure 9, 
optionally with mutations or deletions which do not substantially 
affect the activity thereof. 

10. A protein according to claim 9, wherein the SH3 binding
motif is PTSPTSPGPQQAPPRET. 

11. A protein according to claim 9 or claim 10, wherein the SH3 
binding domain further comprises the polypeptide segment 
GPQRLSGSSYGSISS at the N-terminal end of the SH3 motif. 

12. A natural resistance-associated macrophage protein 
comprising the human amino acid sequence shown in Figure 9, 
optionally with mutations or deletions which do not substantially 
affect the activity thereof.

13. A nucleotide sequence encoding a protein according to any
one of the preceding claims.

14. A nucleotide sequence encoding a protein according to any
one of claims 1 to 8, wherein the SH3 binding domain of the
protein is encoded by the sequence comprising
CCTGGCCCAGCACCTCAGCCAGCGCCTTGCCGG.

15. A nucleotide sequence according to claim 14, wherein the
sequence encoding the SH3 binding domain further comprises the
upstream region AGCCCCCCGAGGCTGAGCAGGCCCAGTTATGGCTCCATTT
CCAGCCTG.

16. A nucleotide sequence encoding a protein according to any
one of claims 1 to 3, or 9 to 12, wherein the SH3 binding domain
of the protein is encoded by the sequence comprising CCG ACC AGC
CCG ACC AGC CCA GGG CCA CAG CAA GCA CCT CCC AGA GAG ACC.

17. A nucleotide sequence according to claim 16, wherein the
sequence encoding the SH3 binding domain further comprises the
upstream region GGT CCC CAA AGG CTA AGC GGG TCC AGC TAT GGT TCC
[...] 17, which is a DNA sequence.

19. A nucleotide sequence according to claim 18, which is a cDNA
sequence.

20. cDNA encoding natural resistance-associated macrophage
protein, which comprises the nucleotide sequence shown in Figure
2, optionally with mutations or deletions which do not
substantially affect the activity of the protein.

21. DNA encoding natural resistance-associated macrophage
protein and comprising the following sequence, optionally with
mutations or deletions which do not substantially affect the
activity of the protein:

CCTGCGTCCTCATGATTAgtaatagcctccagggacctaatgggattcca 
gtgggggtgacttgagggggcaaggaagatttagggtctctgtgggggct 
cagctctgccagagcgacttcagtcaggcacttctgtggattaaacctgg 
tggaggagacagagcatgggggtcaagcagctgagcgaggggcctcctgt 
ctcacaaatctcctgactcaggggatttggattggagaagttctgttcct 
cactgggagggaagtgattcttggaacctctgcttggcacataggtggac 
ctgccagttgcggggagggaggtcgaggtcgtgggaggaggcaggtggct 
tgaatcccaggctcttgaaagaagcacacacccacctagcatcctggggt 
ccctgacagGTGACAAGAGCCCCCCGAGGCTGAGCAGGCCCAGTTATGGC 
TCCATTTCCAGCCTGCCTGGCCCAGCACCTCAGCCAGCGCCTTGCCGGGA 
GACCTACCTGAGTGAGAAGATCCCCATTCCCAGCGCAGACCAGgtaggga 
tggtaggaatgtcctcagtgcttcccaggtcctaccggatccgagctcgg 
accaag 300bp 

tcaagcttgagtgcatgtgcagctgtgtccattatagagcatgcgcgtgg 
aggtcagaggacaacttgtgagagtcagctcacctctactgcgtgggttc 
caactctggtggccttagcctctgagccccttctcggtccccattgccac 
actctaagcagattcctaggctgtcgggccaaaccctgaaatagagttga 
gtgactgagacctcagtggtccccagagagaagagcctgaagtatgagaa 
gggtctggggagggaagagctgtagcagggaggtttaattacaacaaggt 
ccccctcttagaactctgagaagcctgaaagaggcaggcaggtcatgtgc 



ggcati-gggtgggtga^ggdg.jcij.-.g.jgjL.j 
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atacctgctgagtttttagctgaggggatggtcaaggccagctgcatcca 
tccaggagctaacatgacccgatctgcttgcacccccagGGTACATTCAG 
CCTGAGGAAGCTGTGGGCGTTCACGGGGCCTGGTTTCCTCATGAGCATCG 
CTTTCCTTGACCCGGGAAACATTGAGTCCGACCTTCAAGCTGGCGCTGTG 
GCTGGGTTCAAAgtactgagtctgggccgccatgcttgctttgtggggag 
cactttccttagctaggacaggggagaccccagttttccagagccggctg 
catgggtggtttttctgaggataagctcctatcggggaggaaaaggaacc 
ttggagaaacccctggagaaaggatgctgtagggtgttagtcttcccgcc 
caatccccatcagagaggctgctctggctgagcatctcctctgtttcctc 
acagCTCCTCTGGGTGCTGCTCTGGGCCACTGTGCTAGGTTTGCTGTGCC 
AGCGGCTGGCTGCCCGGCT 

GGGCGTGGTGACAGGCAAGGACTTGGGTGAAGTCTGCCATCTCTACTACC
CCAAGGTGCCCCGCATCCTCCTCTGGCTGACCATTGAGCTGGCCATTGTG 
GGCTCAGATATGCAGGAAGTCATCGGGACGGCTATCTCCTTCAATCTGCT 
CTCCGCTGGACGCATCCCGCTGTGGGGCGGTGTACTGATCACCATTGTGG 
ACACCTTCTTCTTCCTCTTCTTGGATAACTATGGTTTGCGCAAGCTGGAA 
GCTTTCTTCGGTCTCCTCATTACCATAATGGCTTTGACCTTCGGCTATGA 
GTATGTGGTAGCACACCCTTCCCAGGGAGCGCTCCTTAAGGGCCTGGTGC 
TGCCCACCTGTCCGGGCTGTGGGCAGCCCGAGCTGCTGCAGGCAGTGGGC 
ATCGTCGGTGCCATCATCATGCCCCATAACATCTACCTGCACTCAGCCTT 
GGTCAAGTCTAGAGAAGTAGACAGAACCCGCCGGGTGGATGTTCGAGAAG 
CCAACATGTACTTCCTGATTGAGGCCACCATCGCCCTATCGGTGTCCTTC 
ATCATCAACCTCTTTGTCATGGCTGTTTTTGGTCAGGCCTTCTACCAGCA 
AACCAATGAGGAAGCGTTCAACATCTGTGCCAACAGCAGCCTCCAGAACT 
ATGCTAAGATCTTCCCCAGGGACAATAACACTGTGTCAGTGGATATTTAT 
CAAGGAGGTGTGATCCTAGGCTGTCTCTTTGGCCCTGCGGCCCTCTACAT 
CTGGGCAGTAGGTCTCCTGGCAGCGGGGCAGAGTTCTACTATGACCGGCA 
CCTATGCAGGACAGTTCGTGATGGAGGGTTTCCTTAAGCTGCGGTGGTCC 
CGCTTCGCTCGGGTCCTTCTCACGCGCTCTTGCGCCATCCTGCCCACTGT 
GTTGGTGGCTGTCTTCCGAGACCTGAAGGACCTGTCCGGCCTCAACGATC 
TACTCAATGTTCTGCAGAGTCTACTGCTGCCCTTCGCTGTACTGCCCATT 
TTGACTTTCACCAGCATGCCAGCTGTCATGCAGGAGTTTGCCAACGGCCG 
GATGAGCAAAGCCATCACTTCGTGCATCATGGCGCTAGTCTGCGCCATCA 
ACCTGTACTTTGTGATCAGCTACCTGCCCAGCCTCCCGCACCCTGCCTAC 
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GTGCAGGGTTCCGGGTGACCGCGGCATCCAGCAAGCAAAGAGGCAACAGG 
GCAGACACAGCAGAGCAATTGGAGGTCCCCTACTGGCTTTCTGGATTACC 
GGTTTCCAGTTTGGACAAGTGCTTTACCTCGGAATAATGACACCATTCTT 
ATCACCACAACCTAAGAGACTTAAAAAACACAGTGCCTGGGGCGAGAGAT 
GGCTCAGGTGTGAAGAACACTAGCCACCACCCTTTCAGAAGATGGGGATT 
CAATTCCCAGCATCAACGTGGTGGCTTTCAACTGAAGGTGACTCCAGTTC 
CCAGAACACCTCAAACAGAACTGCCACAACTCCATTGTCTCACTCCAGCT 
CGTGGAAGATGAAGGGAGGAGTCCTAAAGAGTTCTAGGTCGGGTCTCTGG 
AGAGACGGCTCAGCTGTTAAGAGCACCCGACTGCTCTTCCAGAGGTCCTG 
AGTTCAATTCCCAGCAACCACATGGTGGCTCACAACCATCCATAATGGGA 
TCCCTCTTCTGGTGTGTCTGAAGACAACAACAGTGTCCTCACATATATAA 
AATAAATAAATCTTAAAAAAAAAAAAAAAAAAAAAAAACTCGAG 

22. A DNA sequence encoding a protein according to any one of
claims 1 to 3 or 9 to 12, which sequence comprises one or more
of the exons shown in Table 3, each of which is flanked by intron
boundary regions.

23. A nucleotide sequence comprising the promoter region of the
nucleotide sequence according to any one of claims 13 to 22,
which promoter region includes a poly gt site.

24. A nucleotide sequence according to claim 23, wherein the
poly gt site is of general formula t(gt)5ac(gt)5ac(gt)ng, in which
n=0 or an integer.

25. A nucleotide sequence according to claim 13, which is an 
RNA. 

26. Plasmid pBabeXS . 1 incorporating cDNA according to claim 20. 

27. A retroviral vector construct incorporating a nucleotide 
sequence according to claim 19 or claim 20. 



WO 95/20044 



PCT/GB95/00095 




least a portion thereof. 

29. A nucleotide primer pair according to claim 28, wherein the 
portion of the nucleotide sequence to be amplified is the poly 
gt site. 

30. A nucleotide primer pair capable of hybridising to a portion 
of the nucleotide sequence of any one of claims 13 to 24, which 
nucleotide sequence encodes the N-terminal region of the protein 
which comprises or is upstream of the SH3 binding domain. 

31. A nucleotide primer pair capable of hybridising to an exon 
as defined in claim 22, or the intron boundaries thereof, so as 
to permit amplification of at least a portion of the exon. 

32. A nucleotide primer pair according to claim 31, wherein the 
exon is exon 2 of the human NRAMP gene. 

33. A nucleotide primer pair according to claim 32, wherein the 
portion of the exon to be amplified comprises the sequence 
encoding the SH3 binding domain. 

34. A nucleotide probe capable of hybridising to at least a
portion of the nucleotide sequence of any one of claims 13 to 24,
which nucleotide sequence encodes the N-terminal region of the
protein which comprises or is upstream of the SH3 binding domain.

35. A nucleotide probe according to claim 34, which comprises
a cDNA sequence.

36. A nucleotide probe capable of hybridising to the nucleotide
sequence according to claim 23 or claim 24, or to at least a
portion of the DNA sequence according to claim 22.




38. A polypeptide fragment of a protein according to any one of
claims 1 to 12, which comprises at least a portion of the
N-terminal region.

39. A polypeptide fragment of a protein according to any one of
claims 1 to 12, which comprises an amino acid sequence selected
from DKSPPRLSRPSYGSISS, PQPAPCRETYLSEKIPIP,
GTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQ and WTCCIAHGATFLTHSSHKHFLYGL.

40. An antibody to a protein according to any one of claims 1
to 12.

41. An antibody to a polypeptide fragment according to claim 38
or claim 39.

42. An antibody according to claim 40 or claim 41, which is a
monoclonal antibody.

43. Use of a primer pair according to any one of claims 28 to
33, in a diagnostic test to detect a polymorphism in the NRAMP
gene.

44. Use of a probe according to any one of claims 34 to 37, in
a diagnostic test to detect a polymorphism in the NRAMP gene.
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1 GAATTCGGCACGAGGGAGCTAGTTGCCAGGCCTGGTGACCACACACAGAG 

MISDKS-PPRL10 
51 TATCCTGCCGCCTGCGTCCTCATGATTAGTGACAAGAGCCCCCCGAGGCT 

SRPSYGSISSLPGPAP 26 
101 GAGCAGGCCCAGTTATGGCTCCATTTCCAGCCTGCCTGGCCCAGCACCTC 

QPAPCRETYLSEKIPIP 43 
151 AGCCAGCGCCTTGCCGGGAGACCTACCTGAGTGAGAAGATCCCCATTCCC 

SADQGTFSLRKLWAFTG60 
201 AGCGCAGA CCAGGGTACATTCAGCCTGAGGAAGCTGTGGGCGTTCACGGG 

PGFLMSIAFLDPGNIE 76 
251 GCCTGGTTTCCTCATGAGCATCGCTTTCCTTGACCCGGGAAACATTGAGT 

S D L Q A G AVAGFKLL WVL 93 
301 CCGACCTTCAAGCTGGCGCTGTGGCTGGGTTCAAACTCCTCTGGGTGCTG 

LWATVLGLL CQRLAARL 110 
351 CTCTGGGCCACTGTGCTAGGTTTGCTU'l'GCCAGCGGCTGGCTGCCCGGCT 

GVVTG KDLGEVCHLYY 126 
401 GGGCGTGGTGACAGGCAAGGACTTGGGTGAAGTCTGCCATCTCTACTACC 

PKVPRILLWLTIELAIV 143 
4 51 CCAAGGTGCCCCGCATCCTCCTCTGGCTGACCATTGAGCTGGCCATTGTG 

GSDMQEVIGTAISFNLL 160 
501 GGCTCAGATATGCAGGAAGTCATCGGGACGGCTATCTCCTTCAATCTGCT 

S A G R I PLWGGVLITIV 176 
551 CTCCGCTGGACGCATCCCGCTGTGGGGCGGTGTACTGATCACCATTGTGG 

DTFFFLFL DNYGLRKLE 193 
601 ACACCTTCTTCTTCCTCTTCTTGGATAACTATGGTTTGCGCAAGCTGGAA 

AFFGLLITIMALTFGYE 210 
651 GCTTTCTTCGGTCTCCTCATTACCATAATGGCTTTGACCTTCGGCTATGA 

Y V V A HPSQGALLKGLV 226 
701 GTATGTGGTAGCACACCCTTCCCAGGGAGCGCTCCTTAAGGGCCTGGTGC 

LPTCPGCGQPE L L 0 A V G 243 
751 TGCCCACCTGTCCGGGCTGTGGGCAGCCCGAGCTGCTGCAGGCAGTGGGC 

IVGAI IMPHNIYL H S A L 260 
8 01 ATCGTCGGTGCCATCATCATGCCCCATAACATCTACCTGCACTCAGCCTT 

VKSREVDRTRRVDVRE 276 
851 GGTCAAGTCTAGAGAAGTAGACAGAACCCGCCGGGTGGATGTTCGAGAAG 

ANMYFLIEA TIALSVSF 293 
901 CCAAGATGTACTTCCTGATTGAGGCCACCATCGCCCTATCGGTGTCCTTC 

I INLFVMAVFGQA F Y Q Q 310 
951 ATCATCAACCTCTTTGTCATGGCTGTTTTTGGTCAGGCCTTCTACCAGCA 

TNEEAFNICANSSLQN 226 
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W A V G L L AAGQSSTMTG 376 
1151 CTGGGCAGTAGGTCTCCTGGCAGCGGGGCAGAGTTCTACTATGACCGGCA 

TYAGQFVMEGFLKLRWS 3 93 
1201 CCTATGCAGGACAGTTCGTGATGGAGGGTTTCCTTAAGCTGCGGTGGTCC 

R F A R VLLTRSCAILPTV 410 
1251 CGCTTCGCTCGGGTCCTTCTCACGCGCTCTTGCGCCATCCTGCCCACTGT 

L V A V F RDLKDLSGLND 426 
13 01 GTTGGTGGCTGTCTTCCGAGACCTGAAGGACCTGTCCGGCCTCAACGATC 

LLNVLOSLLLPFAVLPI 443 

13 51 TACTCAATGTTCTGCAGAGTCTACTGCTGCCCTTCGCTGTACTGCCCATT 

L T F TSMPAVMQEFANGR 460 

14 01 TTGACTTTCACCAGCATGCCAGCTGTCATGCAGGAGTTTGCCAACGGCCG 

M S K AITSCIMALVCAI 476 
14 51 GATGAGCAAAGCCATCACTTCGTGCATCATGGCGCTAGTCTGCGCCATCA 

NLYFVISYL P S L P H P A Y 493 
1501 ACCTGTACTTTGTGATCAGCTACCTGCCCAGCCTCCCGCACCCTGCCTAC 

FGLVALFAIGYLGLTAY 510 
1551 TTTGGCCTTGTGGCTCTGTTCGCAATAGGTTACTTGGGCCTGACTGCTTA 

LAW TCCIAHGATFLTH 526 
1601 TCTGGCCTGGACCTGTTGCATCGCCCACGGAGCCACCTTCCTGACCCACA 

S SHKH FLYGLPNEEQGG 543 
1651 GCTCCCACAAGCACTTCTTATATGGGCTCCCTAACGAGGAGCAGGGAGGC 

V Q G S G * 548 

17 01 GTGCAGGGTTCCGGGTGACCGCGGCATCCAGCAAGCAAAGAGGCAACAGG 
1751 GCAGACACAGCAGAGCAATTGGAGGTCCCCTACTGGCTTTCTGGATTACC 

18 01 GGTTTCCAGTTTGGACAAGTGCTTTACCTCGGAATAATGACACCATTCTT 
1851 ATCACCACAACCTAAGAGACTTAAAAAACAC AGTGCCTGGGGCGAGAGAT 
1901 GGCTCAGGTGTGAAGAACACTAGCCACCACC CTTTCAGAAGATGGGGATT 
1951 CAATTCCCAGCATCAACGTGGTGGCTTTCAACTGAAGGTGACTCCAGTTC 
2 0 0 1 CCAGAACACCTCAAACAGAACTGCCACAACTCCATTGTCTCACTCCAGCT 
2051 CGTGGAAGATGAAGGGAGGAGTCCTAAAGAGTTCTAGGTCGGGTCTCTGG 
2101 AGAGACGGCTCAGCTGTTAAGAGCACCCGACTGCTCTTCCAGAGGTCCTG 
2151 AGTTCAATTCCCAGCAACCACATGGTGGCTCACAACCATCCATAATGGGA 
2201 TCCCTCTTCTGGTGTGTCTGAAGACAACAACAGTGTCCTCACAT ATATAA 
2251 AATAAATAAATCTTAAAAAAAAAAAAAAAAAAAAAAAACT C GAG 
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HI 2 
1 CCTGCGTCCTCATGATTAgeaatagcctccagggacct aatgggattcca 

5 1 at ggggat aac ttc agggege a scgaaga t tt agggt etc t at ggagg ct 

101 cagctctgccagaccgacttcagtcaggcacttctgtggattaaacctgg 

151 togaogaga cagagc a t ocgcgr c aagcagctcagcgaggggcct cccgt 

201 ctcacaaatctcczgacrcaggggatttggattggagaacttctgttcct 

251 cacwgggagggaagtgatzcttcgaacctctgcttggcacataggtggac 

301 ctgccagttgcgccgagggaggtcgaggtcgtgggajggaggcaggtggct 

2 51 t a a at ccc a gg etc ttg a aagaagcacacacccac eta gcatcctooacc 

S D X S P ? ■ ?. L S R ? S ? G i_g 
401 ccctgacagGTGACAAGAGCCCCCCGAGGCTGAGCAGGCCCAGTTATGGC 

SISSL?G?APQ?A?CR 132 
451 TCCATTTCCAGCCTGCCTGGCCCAGCACCTCAGCCAGCGCCTTGCCGGGA 

TiTLSEKI PIPSADQ 47 

501 GACCTACCTGAGTG AG AAGATCCCCATTCCCAGCGCAG ACCAGgt aggga 

551 tggtaggaatgtcctcagtgcttcccaggtcctac-uuyatccgagczcgg 

601 accaag 300bp 

901 tcaagcttgagtgcatgt geagctgcgtccattat agagcatgcgcgtgg 

951 agctcagaggacaacttgtgagagt cagctcactt ctaccgcgtgggttc 

100 1 caactctggtggccttagcczctgagccccttctcggt ccccattgccac 



1051 actctaagcagatt cataggctgt cgggccaaaccctgaaatagagt tga 



1 1C 1 gt gactgag a cc t c ag t g gt c c cc agagagaagagee tga a gtat gag a a 



1151 gggtctggggaggga acagc-gtagcagggaggtttaattacaacaaggt 



1201 ccccccctrgggactctgsgaagcctgaaagaggcaggcaggtcatgtgc 



12 51 tgaccaaczgcagaggccgctgc-gaacaggaccaacccagaaagcagag 



1301 ccatagtgac-tcagcaaacggccctggtccctcgggggacgggcagcggt 



1351 ggcatxgggtgggtgatggaggacagggctggccagcctgactgaagaag 



1401 atacctgccgagt ttttagctgaggggatggtcaaggccagctgcatcca 

" G T F S 5 1 

1451 tccaagaoctaacatgacccgatctgcttgcacccccagGGTACATTCAG 

L *R K 1 W A f T C- ? G F L_ MSI 67 

ISC i c^tgaggT^agctgtgggcgttcacggggcctggtttcctcatgagcatcg 

AFLDPGNIESDLQAGAV 84 
155 1 CTTTCCTTGACCCGGGAAACATTGAGTCCGACCTTCAAGCTGGCGC7GTG 

A G F K 88 
160 1 GCTGGGTTCAAAgt act gag t ctgggccgccstgcttgctttgtggggac 
165 1 cactttccttagct aggacaggggagaccccagttttccagagccggctc 
1701 catqggtagtttttctgaggataagctcc-tatcggggaggaaaaggaacc 
175 1 ttagagaaacccctggagaaaggargccgt agggt at tsctctt cccgcc 
170 1 caatecccatcagacaggctgctctggctgagcatctcctctgtttcctc 

LLw'vLLVATVLGLLC 1C3 
1SS 1 acacCTCCTCTGGGTGCTGCTCTGGGCCACTGTGCTAGGTTTGCTGTGCC 

Q ?. 1 A 1 c " 
CONSENSUS (4/9)   XXX(S,A)PP(R,K)XSRPXXXS(I,V)XSX
1. NRAMP CLONE S.1 DERIVED SEQUENCE
2. NIE-SPECIFIC REGULATORY PROTEIN       12/35 IDENTITIES (34%)
3. GRANULINS PRECURSOR (ACROGRANIN)      13/38 IDENTITIES (34%)
4. FAX PROTEIN TYROSINE KINASE           13/26 IDENTITIES (50%)
5. TEGUMENT PROTEIN UL49                 12/23 IDENTITIES (52%)
6. BETA-1-ADRENERGIC RECEPTOR            12/21 IDENTITIES (57%)
7. EXTENSIN (PRO RICH GLYCOPROTEIN)      14/30 IDENTITIES (46%)
8. EBNA-1                                9/13 IDENTITIES (69%)
9. DYNAMIN (SHIBIRE PROTEIN)             11/20 IDENTITIES (55%)
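
The percentages listed above appear to be the identity fractions expressed as whole numbers and rounded down (14/30, for instance, is given as 46% rather than 47%). A minimal sketch of that arithmetic follows, assuming floor rounding; the function name is illustrative and only pairs that are legible in the listing are used.

```python
def percent_identity(matches, length):
    """Return the identity fraction as a whole-number percentage, rounded down."""
    if length <= 0 or not 0 <= matches <= length:
        raise ValueError("need 0 <= matches <= length and length > 0")
    return (100 * matches) // length


# (identical residues, residues compared) as listed for the aligned proteins
pairs = [(12, 35), (13, 38), (13, 26), (12, 23), (12, 21), (14, 30), (11, 20)]
for matches, length in pairs:
    print(f"{matches}/{length} = {percent_identity(matches, length)}%")
```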
Sequence alignment of exon 2 from human (TOP) and mouse Nramp.
SEQUENCE IDENTITIES: nucleotide 115/147 = 77%
                     amino acid 30/44 = 68% (82% with
                     conserved substitutions)
Alignment of human Nramp exon 2 with GenBank amino acid database
showing sequence identities and conserved substitutions (+).
                                       IDENTITY TO HUMAN NRAMP
MOUSE NRAMP                            72% (28/39)
HUMAN NRAMP
MICROTUBULE-ASSOCIATED PROTEIN 4       32% (14/43)
CYTOKINE RECEPTOR COMMON BETA CHAIN    45% (9/20)
PHOSPHOLIPASE C-BETA 3                 47% (8/17)
ZYXIN REGION I (162-178)               41% (7/17)
exon1|exon2 PKC

human  MTG DKGPQRLSGSSYG SISSPTSPTSPGPQQAPPRETYLSEK  42
mouse  MIS DKSPPRLSRPSYG SISSLPGPA PQPAPCRETYLSEK  39
SMF1   MVNVGPSHAAVAVDASEARKIUvaSEE^ELRDKKDSTVVIEGEAPVRTFTSSSSNHEREr  60
SMF2   MT--SQEYEPIQWSDESQTNNDSVNDAY ADVNTTHESRRRTTLQPNST  46

PRC 1

exon2 | exon3 exon3 | exon4

human  IPIPDTKPGTFSLRKLWAP"TGPGFLMSIAFLDPGNIESDLQLGPVAGFICT^WVLI.WATVL  102
mouse  IPIPSADGGTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQAGAVAGFKLLWVLLWATVL  99
SMF1   TYVSKROVMRDIFAXiXKFIGPGLMVSVAYIDPGNYSTAVDAGASNOFSLLCIIIiLSNFI  120
SMF2   SQSMIGTIJUCYARFIGPGIJWSVSYMDPGNYSTAVAAGSAHRYKLIiFSVLVSNFM  101



exon4 | exon5

human  GLLCQRLAARLGVVTGK33I/3EVCHLYYPKVPRTVIiWLTIEIJV.r^^  162
mouse  GLLCQRIAARLGVVTGKDI«GEVCHLYYPKVPR2ILiWLTIEI^^  159
SMF1   AIFLQCLCIKIiGSVTGLDLSRACREYLPRWLJMTLYFFAECAVTATDIAEVIGTAIAliNI  180
SMF2   AAFWQYXCJUUjGAVTGI^l4AQNCKKHLPFGliNITIiYIIjAEllAIIATDLAEVVGTAISLiNI  161

exon5 | exon6 exon6 | exon7 exon7 |

human  LSAGRIPLWGGVLITTVDTFFFLFLDNY GLRKLEAFFGLLITIKALTFGYE-  213
mouse  LSAGRXPLWGGVLITIVETFFFLFLDNY GLRKLEAFFGLLTTIMALTFGYE-  210
SMF1   LI--KVPLPAGVAITWDVFLIMF--TYKPGASSIRFIRIFECFVAVLWGVCICFAIEL  236
SMF2   LF--HIPI*ALGVILiTWDVLXVLL--AYK?NGS-HKGIRIFXAFV3I»LVVljTVVGrTVEL  216



exon8 exon8 | exon9

human  -YWARPEOGALLRGLFLPSCPGCGHPELLQAVGIVaAIlMPHNIYLHSALVKSR  267
mouse  -YWAHPSOGALLKGLVLPTCPGCGOPELLOAVGIVGAIIMPHNIYIjHSAX.VKSR  264
SMF1   AYIPKSTSVKQVFRG-FVPSAOMFDfiNGIYTAISILGATVMPHSLFliGSAliVOPRLLDYl)  295
SMF2   -FYAKI/3PAKEIFSG-FLPSKAWEGIX3LYLSI^U»TVMPHSLYI/;SGVVOPRLRirro  274

human  EIDRARRVDIREANM YFLI EATIALSVSF IINLFVMAAFGOAFY  311
mouse  EVDRTRRVDVREANM YFLI EATIALSVSF IINLFVMAVTGQAFY  308
SMF1   VKHGNYWSDEQDKVKKSKSTEEIMEEKYFTTYRPTNAAIKYCMKYSMVELSITLFTIALF  355
SMF2   IKNGHY-LPDAND MDNNHDNYRPSYEA1 SETLHFTITELLI SLFTVALF  322



exon9|exon10 exon10 | exon11

human  QKTKQAAFNICANSSLHDYAKIFPMNNATVAVDIYQGGVILGCLFGPAALYIWAIGliAA  371
mouse  QQTNEEAFOTCANSSLQNYAKIFPRDNNTVSaTDlY^  368
SMF1   VN CAILWAG - STLYNS PE -ADGADLFTIHELLSRNLAPAAGTIFMLALLLS  405
SMF2   VN CAILXVSG-ATLYGSTQNAEFJUDLFSIYNLLCSTLSKGAGTVFVlALIiFS  373

*

7

exon11 | exon12

human  GQSSTMTGTYAGQFVMEGFLPJ,RWSSFARVLLTRSCAILPTVLVAVFRDLRDLSGIjroLL  431
mouse  GQSSTMTGTYAGQFVKEGFLKLRWSFvFARVLLTRSCAZLPTVLVAVFRDLKDLSGLNIDLL  428
SMF1   GOSAGWCTMAGQrVSEGHINWKLOPWQRRIiATRCISZIPCLV"TSICIGREALSKALNAS  465
SMF2   G^SAGIVCTLSGQtWSEGFLWm^SPALRRSATRAVAITPCLILVLVAGRSGLSGALNAS  433

exon12 | exon13 exon13 | exon14

human  NVLQSLLLPVAVLP ILTFTSMPTLMQ EFANGLLNKWTSS I  472
mouse  NVLQSLLLPFAVLP ILTFTSMPAVMQ EFANGRMSKAJTSC I  469
SMF1   qvVLSIVIjPFLVAPLIFFTCKKSIMKTEITVDHTEEDSHNHQNNNDRSAGSVIEODGSSG  525
SMF2   QWLSLLLPFVSAPLLYFTSSKKIMRVQLNRTKELSRTTDKKPVADRTEDD--etielee  491



exon14 | exon15
hTv^VCTI^YT^SYLPSLPHPAYFGI^J^i-AAAY^ 

-■- — rr-f-vrr-- r- — vt/"t - * yt i vrr-"7 "-?.tflthsf 



mus= KJ-iFLYGLPNEECGGVQGSC- E4C 

SMF2 N--VYAIVQLG-MSHG2I; 5"5 FIGURE c 

£yrr- N--FYMLLGFT-TGKEVH2. 54 9 
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